CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to United States Provisional Patent Application Serial No. 
06/193,839, entitled EPIGENETIC SEQUENCES FOR ESOPHAGEAL ADENOCARCINOMA, 
filed 31 March 2000. 

STATEMENT REGARDING FEDERALLY-SPONSORED RESEARCH 

This work was supported by NIH/NCI grant R01 CA 75090 to P.W.L. The United States 
has certain rights in this invention, pursuant to 35 U.S.C. § 202(c)(6). 

TECHNICAL FIELD OF THE INVENTION 

The present invention provides a diagnostic or prognostic assay for gastrointestinal 
adenocarcinoma, and particularly esophageal adenocarcinoma ("EAC"). Specifically, the present 
invention provides a multigenic epigenetic fingerprint or methylation pattern, that can be
assayed by standard methylation assays of CpG island methylation status, and that comprises the 
relative methylation status of two or more genes in gastrointestinal carcinomas, normal squamous 
cells, and EAC. 

BACKGROUND OF THE INVENTION 

DNA methylation and cancer. DNA methylation patterns are frequently altered in human 
cancers. These methylation changes include genome-wide hypomethylation as well as regional 
hypermethylation (Jones & Laird, Nat Genet. 21:163-167, 1999). Aberrant hypermethylation in 
cancer cells often occurs at CpG islands, which are generally protected from methylation in 
normal tissues. Hypermethylation of promoter CpG islands (that is, CpG islands located in 
promoter regions of genes) has been associated with transcriptional silencing in many types of 
human cancers. 

Methylation patterns of genes can provide different types of useful information about a 
cancer cell. First, each tumor type (i.e., breast, colon, esophagus, etc.) has a characteristic set of
genes with an increased propensity to become methylated (Costello et al., Nat. Genet. 24:132-138,
2000). For example, RB1 is known to be hypermethylated in retinoblastoma (Stirzaker et al.,
Cancer Res. 57:2229-2237, 1997; Sakai et al., Am. J. Hum. Genet. 48:880-888, 1991), but not in 
acute myelogenous leukemia (Kornblau & Qiu, Leuk. Lymphoma. 35:283-288, 1999; Melki et al., 
Cancer Res. 59:3730-3740, 1999). 

Second, an individual tumor within a single patient has a unique epigenetic fingerprint 
reflective of the evolution of that tumor as compared to a tumor of the same type in a different
patient (Costello et al., Nat. Genet. 24:132-138, 2000). 

Generally, however, most studies of epigenetic alterations in cancer have focused primarily 
on either a very small set of known genes (Jones & Laird, Nat Genet. 21:163-167, 1999; Baylin & 
Herman, Trends Genet. 16:168-174, 2000) or on the global analysis of unknown CpG islands 
(Costello et al., Nat. Genet. 24:132-138, 2000), and thus do not provide a suitable diagnostic 
and/or prognostic framework. 

Esophageal adenocarcinoma ("EAC"). Esophageal adenocarcinoma ("EAC") arises from 
a multistep process whereby normal squamous mucosa undergoes metaplasia to specialized 
columnar epithelium (Intestinal Metaplasia (IM) or Barrett's esophagus), which then ultimately
progresses to dysplasia and subsequent malignancy (Barrett et al., Nat. Genet. 22:106-109, 1999; 
Zhuang et al., Cancer Res. 56:1961-4, 1996). The incidence of EAC has increased rapidly in the 
Western World over the past three decades (Devesa et al., Cancer. 83:2049-2053, 1998; 
Jankowski et al., Am. J. Pathol. 154:965-973, 1999). 

Unfortunately, epigenetic studies of this model have so far been limited to the DNA 
methylation analysis of a few genes (Wong et al., Cancer Res. 57:2619-2622, 1997; Klump et al.,
Gastroenterology. 115:1381-1386, 1998; Eads et al., Cancer Res. 60:5021-5026, 2000). 

CpG island methylator phenotype ("CIMP"). It has previously been reported that a subset 
of colorectal and gastric tumors display a CpG island methylator phenotype ("CIMP"), 
characterized by widespread, aberrant hypermethylation changes affecting multiple loci in a single 
tumor (Toyota et al., Proc. Natl. Acad. Sci. USA 96:8681-8686, 1999; Toyota et al., Cancer Res. 
59:5438-5442, 1999). This is reflected in a bimodal distribution of the frequency of the number of 
genes methylated in a group of tumors (Toyota et al., Proc. Natl. Acad. Sci. USA 96:8681-8686,
1999). CIMP tumors are a distinct group of tumors that are defined by a high degree of 
concordant CpG island hypermethylation of genes exclusively methylated in cancer, or type C 
genes. CIMP is now thought to be a new, distinct, yet major pathway of tumorigenesis (Toyota et 
al., Proc. Natl. Acad. Sci. USA 96:8681-8686, 1999; Toyota et al., Cancer Res. 59:5438-5442, 
1999). 

However, the role, if any, of the CIMP pathway in the tumor evolution of EAC is still 
uncharacterized, because the previous epigenetic studies only analyzed one (Wong et al., Cancer 
Res. 57:2619-2622, 1997; Klump et al., Gastroenterology. 115:1381-1386, 1998) or a few genes
(Eads et al., Cancer Res. 60:5021-5026, 2000). 

Therefore, there is a need in the art for novel methods of cancer detection, chemoprediction 
and prognostics. There is a need in the art to define novel coordinate patterns of CpG island 
methylation changes at multiple loci during different steps of a disease, such as cancer. There is a 
need in the art to determine tumor-type-specific, and patient-specific epigenetic patterns or
fingerprints. There is a need in the art to provide biomarkers or probes, such as EAC-specific
biomarkers or probes, that can be used in diagnostic and/or prognostic methods for the treatment
of cancer. There is a need in the art to determine whether esophageal adenocarcinoma displays a
CIMP. There is a need in the art for novel methods for determining the stage of a tumor. The
present invention addresses these needs.

SUMMARY OF THE INVENTION 

The present invention provides a method for diagnosing cancer or cancer-related
conditions from tissue samples, comprising: (a) obtaining a tissue sample from a test tissue or
region to be diagnosed; (b) performing a methylation assay of the tissue sample, wherein the
methylation assay determines the methylation state of genomic CpG sequences, wherein the
genomic CpG sequences are located within at least one gene sequence selected from the group
consisting of APC, ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT,
MLH1, MYOD1, RB1, TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2, TYMS and MTHFR, and
combinations thereof; and (c) making a diagnostic or prognostic prediction of the cancer based, at
least in part, upon the methylation state of the genomic CpG sequences. Preferably, the genomic
CpG sequences located within at least one gene sequence selected from the group consisting of
APC, ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1,
RB1, TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2 and TYMS, correspond to genomic CpG
sequences of CpG islands. Preferably, the APC, ARF, CALCA, CDH1, CDKN2A, CDKN2B,
ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2,
TYMS and MTHFR gene sequences are those defined by the specific oligonucleotide primers and
probes corresponding to SEQ ID NOs:1-60, 64 and 65, as listed in TABLE II, or portions thereof.
Preferably, the CpG islands are located within the promoter regions of the genes. Preferably, the
APC, ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1,
RB1, TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2, and TYMS gene sequences correspond to any
CpG island sequences associated with the sequences defined by the specific oligonucleotide
primers and probes corresponding to SEQ ID NOs:1-54, 58-60, 64 and 65, as listed in TABLE II,
or portions thereof, wherein the associated CpG island sequences are those contiguous sequences
of genomic DNA that encompass at least one nucleotide of the sequences defined by the specific
oligonucleotide primers and probes corresponding to SEQ ID NOs:1-54, 58-60, 64 and 65, and
satisfy the criteria of having both a frequency of CpG dinucleotides corresponding to an
Observed/Expected Ratio >0.6, and a GC Content >0.5.

Preferably, the genomic CpG sequences are located within at least one gene sequence
selected from the group consisting of APC, CDKN2A, MYOD1, CALCA, ESR1, MGMT and
TIMP3, and combinations thereof. Preferably, the genomic CpG sequences located within at least
one gene sequence selected from the group consisting of APC, CDKN2A, MYOD1, CALCA, ESR1,
MGMT and TIMP3, correspond to genomic CpG sequences of CpG islands. Preferably, the APC,
CDKN2A, MYOD1, CALCA, ESR1, MGMT and TIMP3 gene sequences are those defined by the
specific oligonucleotide primers and probes corresponding to SEQ ID NOs:19-21, SEQ ID
NOs:1-3, SEQ ID NOs:7-9, SEQ ID NOs:10-12, SEQ ID NOs:4-6, SEQ ID NOs:16-18 and SEQ ID
NOs:13-15, respectively, as listed in TABLE II. Preferably, the CpG islands are located within
the promoter regions of the genes. Preferably, the APC, CDKN2A, MYOD1, CALCA, ESR1,
MGMT and TIMP3 gene sequences correspond to any CpG island sequences associated with the
sequences defined by the specific oligonucleotide primers and probes corresponding to SEQ ID
NOs:19-21, SEQ ID NOs:1-3, SEQ ID NOs:7-9, SEQ ID NOs:10-12, SEQ ID NOs:4-6, SEQ ID
NOs:16-18 and SEQ ID NOs:13-15, respectively, as listed in TABLE II, or portions thereof,
wherein the associated CpG island sequences are those contiguous sequences of genomic DNA
that encompass at least one nucleotide of the sequences defined by the specific oligonucleotide
primers and probes corresponding to SEQ ID NOs:19-21, SEQ ID NOs:1-3, SEQ ID NOs:7-9,
SEQ ID NOs:10-12, SEQ ID NOs:4-6, SEQ ID NOs:16-18 and SEQ ID NOs:13-15, and satisfy
the criteria of having both a frequency of CpG dinucleotides corresponding to an
Observed/Expected Ratio >0.6, and a GC Content >0.5.

Preferably, the cancer or cancer-related condition is selected from the group consisting of
gastrointestinal or esophageal adenocarcinoma, gastrointestinal or esophageal dysplasia,
gastrointestinal or esophageal metaplasia, Barrett's intestinal tissue, pre-cancerous conditions in
normal esophageal squamous mucosa, and combinations thereof. Preferably, the cancer is
esophageal adenocarcinoma, and making a diagnostic or prognostic prediction of the
cancer based upon the methylation state of the genomic CpG sequences provides for classification
of the adenocarcinoma by grade or stage.

Preferably, the methylation assay used to determine the methylation state of genomic CpG
sequences is selected from the group consisting of MethyLight™, MS-SNuPE, MSP, COBRA,
MCA, and DMH, and combinations thereof.

Preferably, the methylation assay used to determine the methylation state of genomic CpG
sequences is based, at least in part, on an array or microarray comprising CpG sequences located
within at least one gene sequence selected from the group consisting of APC, ARF, CALCA,
CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2,
THBS1, TIMP3, CTNNB1, PTGS2, TYMS and MTHFR. Preferably, the APC, ARF, CALCA,
CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2,
THBS1, TIMP3, CTNNB1, PTGS2, and TYMS gene sequences correspond to any CpG island
sequences associated with the sequences defined by the specific oligonucleotide primers and
probes corresponding to SEQ ID NOs:1-54, 58-60, 64 and 65, as listed in TABLE II, or portions
thereof, wherein the associated CpG island sequences are those contiguous sequences of genomic
DNA that encompass at least one nucleotide of the sequences defined by the specific
oligonucleotide primers and probes corresponding to SEQ ID NOs:1-54, 58-60, 64 and 65, and
satisfy the criteria of having both a frequency of CpG dinucleotides corresponding to an
Observed/Expected Ratio >0.6, and a GC Content >0.5. Preferably, the APC, ARF, CALCA,
CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2,
THBS1, TIMP3, CTNNB1, PTGS2, TYMS and MTHFR gene sequences are those defined by, or
correspond to, the specific oligonucleotide primers and probes corresponding to SEQ ID NOs:1-60,
64 and 65, as listed in TABLE II, or portions thereof.

Preferably, the methylation state of genomic CpG sequences that is determined is that of
hypermethylation, hypomethylation or normal methylation.

The present invention also provides a kit useful for diagnosis or prognosis of cancer or
cancer-related conditions, comprising a carrier means containing one or more containers
comprising: (a) a container containing a probe or primer which hybridizes to any region of a
sequence located within at least one gene sequence selected from the group consisting of APC,
ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1,
TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2, TYMS and MTHFR; and (b) additional standard
methylation assay reagents required to effect detection of methylated CpG-containing nucleic acid
based, at least in part, on the probe or primer. Preferably, the additional standard methylation
assay reagents are standard reagents for performing a methylation assay from the group consisting
of MethyLight™, MS-SNuPE, MSP, COBRA, MCA and DMH, and combinations thereof.
Preferably, the probe or primer comprises at least about 12 to 15 nucleotides of a sequence
selected from the group consisting of SEQ ID NOs:1-60, 64 and 65, as listed in TABLE II.

The present invention further provides a kit useful for diagnosis or prognosis of cancer or
cancer-related conditions, comprising a carrier means containing one or more containers
comprising: (a) an array or microarray comprising sequences of at least about 12 to 15 nucleotides
of a sequence selected from the group consisting of SEQ ID NOs:1-60, 64, 65, and any sequence
located within a CpG island sequence associated with SEQ ID NOs:1-54, 58-60, 64 and 65.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows, according to the present invention, a quantitative methylation analysis of a
panel of 20 genes from a screen of 84 tissue specimens from 31 patients with different stages of
Barrett's esophagus ("IM"), dysplasia ("DYS") and/or associated esophageal adenocarcinoma
("T"). Methylation analysis was performed using the MethyLight™ assay (Eads et al., Cancer
Res. 59:2302-2306, 1999; Eads et al., Nucleic Acids Res. 28:E32, 2000). The percentage of fully
methylated molecules at a specific locus (PMR = Percent of Methylated Reference) was calculated
by dividing the GENE/ACTB ratio of a sample by the GENE/ACTB ratio of SssI-treated sperm
DNA and multiplying by 100. The resulting percentages were then dichotomized at 4% PMR to
facilitate graphical representation and to reveal tissue-specific patterns (as described herein). "N"
indicates an analysis for which the control gene ACTB did not reach sufficient levels to allow the
detection of a minimal value of 1 PMR for that methylation reaction in that particular sample.
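
By way of illustration only, the PMR calculation and the 4 PMR cutoff described for Figure 1 can be sketched in Python as follows. The function and parameter names are hypothetical and are not part of the disclosed assay; the inputs are assumed to be positive quantitative values for the target gene and for ACTB, measured in the sample and in the fully methylated reference DNA.

```python
def percent_methylated_reference(gene_sample, actb_sample,
                                 gene_reference, actb_reference):
    """Return PMR = 100 * (GENE/ACTB in sample) / (GENE/ACTB in reference).

    All four arguments are hypothetical names for measured quantities.
    """
    if actb_sample <= 0 or actb_reference <= 0 or gene_reference <= 0:
        raise ValueError("control and reference quantities must be positive")
    sample_ratio = gene_sample / actb_sample
    reference_ratio = gene_reference / actb_reference
    return 100.0 * sample_ratio / reference_ratio


def is_methylated(pmr, cutoff=4.0):
    """Dichotomize a PMR value: 4 PMR and higher counts as methylated."""
    return pmr >= cutoff


# Made-up input values, for illustration only.
pmr = percent_methylated_reference(2.0, 10.0, 5.0, 10.0)
print(pmr, is_methylated(pmr))
```

The cutoff is exposed as a parameter only so that the sketch is easy to reuse; the description above uses 4 PMR throughout.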

Figure 2 shows the percent of samples methylated for each gene by tissue type. The data 
was dichotomized at 4 PMR, with 4 PMR and higher designated as methylated, and below 4 PMR 
as unmethylated. The genes, according to the present invention, were grouped according to their 
respective epigenetic gene classes (A-G) as shown in Figure 1. The letter "n" equals the number 
of samples analyzed for each tissue. 
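
The tabulation underlying a plot such as Figure 2 can be sketched, under stated assumptions, as a simple count over dichotomized PMR values. The record layout (tissue, gene, PMR) and the function name below are hypothetical, and the example values are invented solely to show the arithmetic; they are not data from the study.

```python
from collections import defaultdict


def percent_methylated_by_tissue(records, cutoff=4.0):
    """Return {(tissue, gene): percent of samples with PMR >= cutoff}.

    `records` is an iterable of (tissue, gene, pmr) tuples (assumed layout).
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for tissue, gene, pmr in records:
        key = (tissue, gene)
        totals[key] += 1
        if pmr >= cutoff:  # 4 PMR and higher is designated as methylated
            positives[key] += 1
    return {key: 100.0 * positives[key] / totals[key] for key in totals}


# Invented example values, not study results.
example = [("T", "APC", 10.0), ("T", "APC", 1.0), ("N", "APC", 0.0)]
print(percent_methylated_by_tissue(example))
```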

Figure 3 shows a comparison of epigenetic profiles according to the present invention.
The data was dichotomized at 4 PMR, with 4 PMR and higher designated as methylated, and
below 4 PMR as unmethylated. Top panel:
Mean percent of genes methylated in each gene Class (A-F or ALL 19 CpG islands) by tissue type
(N, normal esophagus; S, stomach; IM, intestinal metaplasia; DYS, dysplasia; T,
adenocarcinoma). The error bars represent the standard error of the mean (SEM). Bottom panel:
Statistical analysis of the difference in mean percent of genes methylated in different tissues by
gene Class (A-F) or for all 19 CpG islands combined (ALL). The p-values were generated by a
Fisher's Protected Least Significant Difference (PLSD) test, adapted for use with unequal sample
numbers (SAS Statview™ software).

Figure 4 shows the relationship between Class A methylation frequency and tumor stage 
according to the present invention. The data was dichotomized at 4 PMR, with 4 PMR and higher 
designated as methylated, and below 4 PMR as unmethylated. Upper panel: Mean number of 
genes methylated for Class A with respect to tumor stage (I-IV) is shown (see Figure 1). The error 
bars represent the standard error of the mean (SEM). The letter "n" equals the number of samples 
analyzed in each tumor stage. Lower panel: Statistical analysis of the difference in mean number 
of Class A genes methylated by tumor stage. The p-values were generated by a Fisher's Protected
Least Significant Difference (PLSD) test, adapted for use with unequal sample numbers (SAS 
Statview™ software). 

Figure 5 shows, according to the present invention, the percent of two or more Class A 
genes methylated in intestinal metaplasia ("IM") tissues with ("Y"), or without ("N") associated 
dysplasia and/or adenocarcinoma. The data was dichotomized at 4 PMR, with 4 PMR and higher 
designated as methylated, and below 4 PMR as unmethylated. Left panel: Class A methylation in 
the IM data illustrated in Figure 1. Right panel: Class A methylation in the IM for a completely 
independent follow-up study of twenty different microdissected IM samples. The error bars 
represent the standard error of the mean (SEM). The letter "n" equals the number of samples
analyzed in each tissue group.

Figure 6 shows, according to the present invention, methylation frequency distributions in
the progression of esophageal adenocarcinoma. The data was dichotomized at 4 PMR, with 4
PMR and higher designated as methylated, and below 4 PMR as unmethylated. The proportion of
patients with zero to three (Class A), zero to nine (Classes A + D) and zero to fourteen CpG
islands (Classes A + B + C + D) methylated in each tissue is shown. Class E and F CpG islands
were not included since there was no variation in the frequency of methylation between the
different tissues. The letter "n" equals the number of samples analyzed in each tissue.

DETAILED DESCRIPTION OF THE INVENTION 
Definitions: 

The term "EAC" refers to esophageal adenocarcinoma, but also encompasses different
histological stages of esophageal adenocarcinoma corresponding to a multistep process whereby
normal squamous mucosa undergoes metaplasia to specialized columnar epithelium (Intestinal
Metaplasia (IM) or Barrett's esophagus), which then ultimately progresses to dysplasia and
subsequent malignancy (Barrett et al., Nat. Genet. 22:106-109, 1999; Zhuang et al., Cancer Res.
56:1961-4, 1996);

The term "CIMP" refers to CpG island methylator phenotype, characterized by widespread
aberrant hypermethylation changes affecting multiple loci in a single tumor. This is reflected in a
bimodal distribution of the frequency of the number of genes methylated in a group of tumors
(Toyota et al., Proc. Natl. Acad. Sci. USA 96:8681-8686, 1999). CIMP tumors are a distinct group
of tumors that are defined by a high degree of concordant
CpG island hypermethylation of genes exclusively methylated in cancer, or type C genes. CIMP is
now thought to be a new, distinct, yet major pathway of tumorigenesis (Toyota et al., Proc. Natl.
Acad. Sci. USA 96:8681-8686, 1999; Toyota et al., Cancer Res. 59:5438-5442, 1999) (see
"Background," above);

The term "PMR" refers to percent of methylated reference, and is calculated as described 
herein under Example I; 

"GC Content" refers, within a particular DNA sequence, to the [(number of C bases +
number of G bases) / band length for each fragment];

"Observed/Expected Ratio" ("O/E Ratio") refers to the frequency of CpG dinucleotides
within a particular DNA sequence, and corresponds to the [number of CpG sites / (number of C
bases × number of G bases)] × band length for each fragment;

"CpG Island" refers to a contiguous region of genomic DNA that satisfies the criteria of
(1) having a frequency of CpG dinucleotides corresponding to an "Observed/Expected Ratio"
>0.6, and (2) having a "GC Content" >0.5. CpG islands are typically, but not always, between
about 0.2 to about 1 kb in length. A CpG island sequence associated with a particular SEQ ID NO
sequence of the present invention is that contiguous sequence of genomic DNA that encompasses
at least one nucleotide of the particular SEQ ID NO sequence, and satisfies the criteria of having
both a frequency of CpG dinucleotides corresponding to an Observed/Expected Ratio >0.6, and a
GC Content >0.5;

"Methylation state" refers to the presence or absence of 5-methylcytosine ("5-mCyt") at 
one or a plurality of CpG dinucleotides within a DNA sequence; 

"Hypermethylation" refers to the methylation state corresponding to an increased presence 
of 5-mCyt at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA
sample, relative to the amount of 5-mCyt found at corresponding CpG dinucleotides within a
normal control DNA sample;

"Hypomethylation" refers to the methylation state corresponding to a decreased presence
of 5-mCyt at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA
sample, relative to the amount of 5-mCyt found at corresponding CpG dinucleotides within a
normal control DNA sample;

"Methylation assay" refers to any assay for determining the methylation state of a CpG
dinucleotide within a sequence of DNA;

"MS.AP-PCR" (Methylation-Sensitive Arbitrarily-Primed Polymerase Chain Reaction)
refers to the art-recognized technology that allows for a global scan of the genome using CG-rich
primers to focus on the regions most likely to contain CpG dinucleotides, and described by
Gonzalgo et al., Cancer Research 57:594-599, 1997;

"MethyLight" refers to the art-recognized fluorescence-based real-time PCR technique
described by Eads et al., Cancer Res. 59:2302-2306, 1999;

"Ms-SNuPE" (Methylation-sensitive Single Nucleotide Primer Extension) refers to the
art-recognized assay described by Gonzalgo & Jones, Nucleic Acids Res. 25:2529-2531, 1997;

"MSP" (Methylation-specific PCR) refers to the art-recognized methylation assay
described by Herman et al., Proc. Natl. Acad. Sci. USA 93:9821-9826, 1996, and by US Patent No.
5,786,146;

"COBRA" (Combined Bisulfite Restriction Analysis) refers to the art-recognized
methylation assay described by Xiong & Laird, Nucleic Acids Res. 25:2532-2534, 1997;

"MCA" (Methylated CpG Island Amplification) refers to the methylation assay described
by Toyota et al., Cancer Res. 59:2307-12, 1999, and in WO 00/26401 A1;

"DMH" (Differential Methylation Hybridization) refers to the art-recognized methylation
assay described in Huang et al., Hum. Mol. Genet. 8:459-470, 1999, and in Yan et al., Clin.
Cancer Res. 6:1432-38, 2000;



Genes and associated literature references:

"APC" refers to the adenomatous polyposis coli gene (Eads et al., Cancer Res. 59:2302-
2306, 1999; Hiltunen et al., Int. J. Cancer. 70:644-648, 1997);

"ARF" refers to the P14 cell cycle regulator, tumor suppressor gene (Esteller et al., Cancer
Res. 60:129-133, 2000; Robertson & Jones, Mol. Cell. Biol. 18:6457-6473, 1998);

"CALCA" refers to the calcitonin gene (Melki et al., Cancer Res. 59:3730-3740, 1999; 
Hakkarainen et al., Int. J. Cancer. 69:471-474, 1996); 

"CDH1" refers to the E-cadherin gene (Melki et al., Cancer Res. 59:3730-3740, 1999; 
Ueki et al., Cancer Res. 60:1835-1839, 2000);

"CDKN2A" refers to the P16 gene (Jones & Laird, Nat. Genet. 21:163-167, 1999; Melki et 
al., Cancer Res. 59:3730-3740, 1999; Baylin & Herman, Trends Genet. 16:168-174, 2000; 
Cameron et al., Nat. Genet. 21:103-107, 1999; Ueki et al., Cancer Res. 60:1835-1839, 2000); 

"CDKN2B" refers to the P15 gene (Melki et al., Cancer Res. 59:3730-3740, 1999; 
Cameron et al., Nat. Genet. 21:103-107, 1999);

"CTNNB1" refers to the beta-catenin gene; 

"ESR1" refers to the estrogen receptor alpha gene (Jones & Laird, Nat. Genet. 21:163-167, 
1999; Baylin & Herman, Trends Genet. 16:168-174, 2000);

"GSTP1" refers to the glutathione S-transferase P1 gene (Melki et al., Cancer Res.
59:3730-3740, 1999; Tchou et al., Int. J. Oncol. 16:663-676, 2000);

"HIC1" refers to the hypermethylated in cancer 1 gene (Melki et al., Cancer Res. 59:3730-
3740, 1999; Wales et al., Nat. Med. 1:570-577, 1995);

"MGMT" refers to the O6-methylguanine-DNA methyltransferase gene (Esteller et al.,
Cancer Res. 59:793-797, 1999);

"MLH1" refers to the Mut L homologue 1 gene (Jones & Laird, Nat. Genet. 21:163-167,
1999; Baylin & Herman, Trends Genet. 16:168-174, 2000; Cameron et al., Nat. Genet. 21:103-
107, 1999; Esteller et al., Am. J. Pathol. 155:1767-1772, 1999; Ueki et al., Cancer Res. 60:1835-
1839, 2000);

"MTHFR" refers to the methylenetetrahydrofolate reductase gene (Pereira et al., Oncol. Rep.
6:597-599, 1999);

"MYOD1" refers to the myogenic determinant 1 gene (Eads et al., Cancer Res. 59:2302- 
2306, 1999; Cheng et al., Br. J. Cancer. 75:396-402, 1997); 

"PTGS2" refers to the cyclooxygenase 2 gene (Zimmermann et al., Cancer Res. 59:198- 
204, 1999); 

"RB1" refers to the retinoblastoma gene (Stirzaker et al., Cancer Res. 57:2229-2237, 1997;
Sakai et al., Am. J. Hum. Genet. 48:880-888, 1991);

"TGFBR2" refers to the transforming growth factor beta receptor II gene (Kang et al.,
Oncogene. 18:7280-7286, 1999; Hougaard et al., Br. J. Cancer. 79:1005-1011, 1999);

"THBS1" refers to the thrombospondin 1 gene (Ueki et al., Cancer Res. 60:1835-1839,
2000; Li et al., Oncogene. 18:3284-3289, 1999);

"TIMP3" refers to the tissue inhibitor of metalloproteinase 3 gene (Cameron et al., Nat.
Genet. 21:103-107, 1999; Ueki et al., Cancer Res. 60:1835-1839, 2000; Bachman et al., Cancer
Res. 59:798-802, 1999);

"TYMS" refers to the thymidylate synthetase gene (Sakamoto et al., In: L. Herrera (ed.)
Familial adenomatous polyposis, pp. 315-324. New York: Alan R. Liss, 1990).

Overview

The present invention encompasses a broad, multi-gene approach that provides novel and
therapeutically useful insight into concordant methylation behavior between and among genes. In
particular embodiments, the present invention provides novel epigenomic fingerprints for the
different histological stages of esophageal adenocarcinoma (EAC).

More specifically, the present invention combines the advantages of both targeted and
comprehensive approaches by analyzing 20 different genes (see Table I, below) using a
quantitative, high-throughput methylation assay, "MethyLight™" (Eads et al., Cancer Res.
59:2302-2306, 1999; Eads et al., Cancer Res. 60:5021-5026, 2000; Eads et al., Nucleic Acids Res.
28:E32, 2000), to (i) more extensively characterize the methylation changes in esophageal
adenocarcinoma (EAC); to (ii) generate epigenomic fingerprints for the different histological
stages of EAC; to (iii) identify epigenetic biomarkers useful in disease diagnosis and prevention;
and to (iv) determine if CIMP is a contributor to the tumorigenesis of esophageal adenocarcinoma
tumors.

A total of 104 tissue specimens from 51 patients with different stages of Barrett's 
esophagus and/or associated adenocarcinoma were analyzed. Specifically, 84 of these tissue 
specimens were screened with the full panel of 20 genes, revealing distinct classes of methylation 
patterns in the different types of tissue. 

The most informative genes, for purposes of the present invention, were those with an
intermediate frequency of significant hypermethylation (i.e., those ranging from about 15%
(CDKN2A) to about 60% (MGMT) of the samples). This group of genes could be further
subdivided into three classes, according to the (1) absence (CDKN2A, ESR1 and MYOD1), or (2)
presence (CALCA, MGMT and TIMP3) of methylation in normal esophageal mucosa and stomach,
or (3) the infrequent methylation of normal esophageal mucosa accompanied by methylation in all
normal stomach samples (APC).

The other genes were relatively less informative, since the frequency of hypermethylation
was below about 5% (ARF, CDH1, CDKN2B, GSTP1, MLH1, PTGS2 and THBS1), completely
absent (CTNNB1, RB1, TGFBR2 and TYMS) or ubiquitous (HIC1 and MTHFR), regardless of
tissue type.

Each class of gene undergoes unique epigenetic changes at different steps of disease
progression of EAC, consistent with a step-wise loss of multiple protective barriers against CpG
island hypermethylation. The aberrant hypermethylation occurs at many different loci in the same
tissues, consistent with an overall deregulation of methylation control in EAC tumorigenesis.
However, there was no clear evidence for a distinct group of tumors with a CpG island methylator
phenotype ("CIMP").

Additionally, normal and metaplastic tissues from patients with evidence of associated
dysplasia or cancer displayed a significantly higher incidence of hypermethylation than similar
tissues from patients with no further progression of their disease. The fact that the samples from
these two groups of patients were histologically indistinguishable, yet molecularly distinct,
indicates, according to the present invention, that the occurrence of such hypermethylation
provides a novel and valuable clinical tool to identify patients with pre-malignant Barrett's, who
are at risk for further progression.

TABLE I shows a list of gene names and functions analyzed by the MethyLight™ assay in
EAC. The genes are listed in alphabetical order based on their designated HUGO (HUman
Genome Organization) names. The genes are divided into three groups according to whether or
not they have CpG islands and are known to be methylated in other tumors. A brief description of
the function of each gene is included.

Diagnostic and Prognostic Assays for Cancer

The present invention provides for diagnostic and prognostic cancer assays based on
determination of the methylation state of one or more of the disclosed 20 gene sequences (APC,
ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1,
TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2, TYMS and MTHFR; see TABLES I and II, below;
and see under "Definitions," above), or methylation-altered DNA sequence embodiments thereof.
These 20 gene sequence regions are defined herein by the oligomeric primers and probes
corresponding to SEQ ID NOS:1-60, 64 and 65 (see TABLE II, below). SEQ ID NOS:61-63
correspond to the ACTB "control" gene region used in the present analysis (see EXAMPLE 1,
below).

Additionally, 19 of these 20 gene sequence regions correspond to CpG islands or regions
thereof (based on GC Content and O/E ratio); namely APC, ARF, CALCA, CDH1, CDKN2A,
CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2, THBS1, TIMP3,
CTNNB1, PTGS2 and TYMS (see TABLE I, below). Thus, based on the fact that the methylation
state of a portion of a given CpG island is generally representative of the island as a whole, the
present invention further encompasses the novel use, in cancer prognostic and diagnostic
applications, of any sequences within the 19 complete CpG islands associated with these 19 gene
sequence regions (defined herein by the primers and probes corresponding to SEQ ID NOS:1-60,
64 and 65; see TABLE II, below), where a CpG island sequence associated with one of these 19
gene sequences is that contiguous sequence of genomic DNA that encompasses at least one
nucleotide of one of these 19 gene sequences, and satisfies the criteria of having both a frequency
of CpG dinucleotides corresponding to an Observed/Expected Ratio >0.6, and a GC Content >0.5.
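
The two CpG island criteria stated above can be illustrated with a minimal sketch. The text gives only the thresholds (Observed/Expected Ratio >0.6 and GC Content >0.5); the formula used here for the Observed/Expected Ratio, (number of CpG x sequence length) / (number of C x number of G), is an assumption taken from the commonly used Gardiner-Garden and Frommer formulation, and the function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/Expected CpG ratio (assumed formula, see lead-in)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    observed_cpg = seq.count("CG")  # "CG" cannot overlap itself
    return observed_cpg * len(seq) / (c * g)


def meets_cpg_island_criteria(seq: str, min_oe: float = 0.6, min_gc: float = 0.5) -> bool:
    """True if both thresholds named in the text are exceeded."""
    return cpg_obs_exp(seq) > min_oe and gc_content(seq) > min_gc
```

In practice such a test would be applied over a sliding window of genomic sequence; the window length is not specified in this passage and is left to the user.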

Typically, such assays involve obtaining a tissue sample from a test tissue, performing a
methylation assay on DNA derived from the tissue sample to determine the associated methylation
state, and making a diagnosis or prognosis based thereon.

The methylation assay is used to determine the methylation state of one or a plurality of
CpG dinucleotides within a DNA sequence of the DNA sample. According to the present
invention, possible methylation states include hypermethylation and hypomethylation, relative to a
normal state (i.e., non-cancerous control state). Hypermethylation and hypomethylation refer to
the methylation states corresponding to an increased or decreased, respectively, presence of
5-methylcytosine ("5-mCyt") at one or a plurality of CpG dinucleotides within a DNA sequence of
the test sample, relative to the amount of 5-mCyt found at corresponding CpG dinucleotides
within a normal control DNA sample.

A diagnosis or prognosis is based, at least in part, upon the determined methylation state of
the sample DNA sequence compared to control data obtained from normal, non-cancerous tissue.

Methylation Assay Procedures

Various methylation assay procedures are known in the art, and can be used in conjunction
with the present invention. These assays allow for determination of the methylation state of one or
a plurality of CpG dinucleotides within a DNA sequence (e.g., CpG islands). Such assays involve,
among other techniques, DNA sequencing of bisulfite-treated DNA, PCR (for sequence-specific
amplification), Southern blot analysis, use of methylation-sensitive restriction enzymes, etc.

For example, genomic sequencing has been simplified for analysis of DNA methylation
patterns and 5-methylcytosine distribution by using bisulfite treatment (Frommer et al., Proc. Natl.
Acad. Sci. USA 89:1827-1831, 1992). Additionally, restriction enzyme digestion of PCR products
amplified from bisulfite-converted DNA is used, e.g., the method described by Sadri & Hornsby
(Nucl. Acids Res. 24:5058-5059, 1996), or COBRA (Combined Bisulfite Restriction Analysis)
(Xiong & Laird, Nucleic Acids Res. 25:2532-2534, 1997).

Preferably, assays such as "MethyLight™" (a fluorescence-based real-time PCR
technique) (Eads et al., Cancer Res. 59:2302-2306, 1999), Methylation-sensitive Single
Nucleotide Primer Extension reactions ("Ms-SNuPE"; Gonzalgo & Jones, Nucleic Acids Res.
25:2529-2531, 1997), methylation-specific PCR ("MSP"; Herman et al., Proc. Natl. Acad. Sci.
USA 93:9821-9826, 1996; US Patent No. 5,786,146), and methylated CpG island amplification
("MCA"; Toyota et al., Cancer Res. 59:2307-12, 1999) are used alone or in combination with
others of these methods. Methylation assays that can be used in various embodiments of the
present invention include, but are not limited to, the following assays.

COBRA (Combined Bisulfite Restriction Analysis). COBRA analysis is a quantitative
methylation assay useful for determining DNA methylation levels at specific gene loci in small
amounts of genomic DNA (Xiong & Laird, Nucleic Acids Res. 25:2532-2534, 1997). Briefly,
restriction enzyme digestion is used to reveal methylation-dependent sequence differences in PCR
products of sodium bisulfite-treated DNA. Methylation-dependent sequence differences are first
introduced into the genomic DNA by standard bisulfite treatment according to the procedure
described by Frommer et al. (Proc. Natl. Acad. Sci. USA 89:1827-1831, 1992). PCR amplification
of the bisulfite-converted DNA is then performed using primers specific for the CpG islands of
interest, followed by restriction endonuclease digestion, gel electrophoresis, and detection using
specific, labeled hybridization probes. Methylation levels in the original DNA sample are
represented by the relative amounts of digested and undigested PCR product in a linearly
quantitative fashion across a wide spectrum of DNA methylation levels. Additionally, this
technique can be reliably applied to DNA obtained from microdissected paraffin-embedded tissue
samples. Typical reagents (e.g., as might be found in a typical COBRA-based methylation kit) for
COBRA analysis may include, but are not limited to: PCR primers for specific gene (or
methylation-altered DNA sequence or CpG island); restriction enzyme and appropriate buffer;
gene-hybridization oligo; control hybridization oligo; kinase labeling kit for oligo probe; and
radioactive nucleotides (although other label schemes known in the art including, but not limited
to, fluorescent and phosphorescent schemes can be used). Additionally, bisulfite conversion
reagents may include: DNA denaturation buffer; sulfonation buffer; DNA recovery reagents or kit
(e.g., precipitation, ultrafiltration, affinity column); desulfonation buffer; and DNA recovery
components.
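
The statement that methylation levels are represented by the relative amounts of digested and undigested PCR product can be sketched as a simple calculation. This is an illustrative sketch only: it assumes the restriction site is retained only in templates that were methylated (so the digested fraction equals the methylated fraction), and the function name and band-intensity inputs are hypothetical.

```python
def cobra_percent_methylation(digested: float, undigested: float) -> float:
    """Percent methylation from band intensities of digested and undigested product.

    Assumes digestion occurs only in product derived from methylated templates.
    """
    total = digested + undigested
    if digested < 0 or undigested < 0 or total <= 0:
        raise ValueError("band intensities must be non-negative with a positive total")
    return 100.0 * digested / total
```

If the chosen enzyme instead cuts only product derived from unmethylated templates, the complementary fraction would be reported.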

Ms-SNuPE (Methylation-sensitive Single Nucleotide Primer Extension). The Ms-SNuPE
technique is a quantitative method for assessing methylation differences at specific CpG sites
based on bisulfite treatment of DNA, followed by single-nucleotide primer extension (Gonzalgo &
Jones, Nucleic Acids Res. 25:2529-2531, 1997). Briefly, genomic DNA is reacted with sodium
bisulfite to convert unmethylated cytosine to uracil while leaving 5-methylcytosine unchanged.
Amplification of the desired target sequence is then performed using PCR primers specific for
bisulfite-converted DNA, and the resulting product is isolated and used as a template for
methylation analysis at the CpG site(s) of interest. Small amounts of DNA can be analyzed (e.g.,
microdissected pathology sections), and the technique avoids the use of restriction enzymes for
determining the methylation status at CpG sites. Typical reagents (e.g., as might be found in a
typical Ms-SNuPE-based methylation kit) for Ms-SNuPE analysis may include, but are not limited
to: PCR primers for specific gene (or methylation-altered DNA sequence or CpG island);
optimized PCR buffers and deoxynucleotides; gel extraction kit; positive control primers;
Ms-SNuPE primers for specific gene; reaction buffer (for the Ms-SNuPE reaction); and radioactive
nucleotides. Additionally, bisulfite conversion reagents may include: DNA denaturation buffer;
sulfonation buffer; DNA recovery reagents or kit (e.g., precipitation, ultrafiltration, affinity
column); desulfonation buffer; and DNA recovery components.

MSP (Methylation-specific PCR). MSP allows for assessing the methylation status of
virtually any group of CpG sites within a CpG island, independent of the use of
methylation-sensitive restriction enzymes (Herman et al., Proc. Natl. Acad. Sci. USA 93:9821-9826,
1996; US Patent No. 5,786,146). Briefly, DNA is modified by sodium bisulfite, which converts all
unmethylated, but not methylated, cytosines to uracil, and is subsequently amplified with primers
specific for methylated versus unmethylated DNA. MSP requires only small quantities of DNA, is
sensitive to 0.1% methylated alleles of a given CpG island locus, and can be performed on DNA
extracted from paraffin-embedded samples. Typical reagents (e.g., as might be found in a typical
MSP-based kit) for MSP analysis may include, but are not limited to: methylated and unmethylated
PCR primers for specific gene (or methylation-altered DNA sequence or CpG island), optimized
PCR buffers and deoxynucleotides, and specific probes.

MCA (Methylated CpG Island Amplification). The MCA technique is a method that can be
used to screen for altered methylation patterns in genomic DNA, and to isolate specific sequences
associated with these changes (Toyota et al., Cancer Res. 59:2307-12, 1999). Briefly, restriction
enzymes with different sensitivities to cytosine methylation in their recognition sites are used to
digest genomic DNAs from primary tumors, cell lines, and normal tissues prior to arbitrarily
primed PCR amplification. Fragments that show differential methylation are cloned and
sequenced after resolving the PCR products on high-resolution polyacrylamide gels. The cloned
fragments are then used as probes for Southern analysis to confirm differential methylation of
these regions. Typical reagents (e.g., as might be found in a typical MCA-based kit) for MCA
analysis may include, but are not limited to: PCR primers for arbitrary priming of genomic DNA;
PCR buffers and nucleotides; restriction enzymes and appropriate buffers; gene-hybridization
oligos or probes; and control hybridization oligos or probes.

DMH (Differential Methylation Hybridization). DMH refers to the art-recognized,
array-based methylation assay described in Huang et al., Hum. Mol. Genet. 8:459-470, 1999, and in
Yan et al., Clin. Cancer Res. 6:1432-38, 2000. DMH allows for a genome-wide screening of CpG
island hypermethylation in cancer cell lines. Briefly, CpG island tags are arrayed on solid
supports (e.g., nylon membranes, silicon, etc.), and probed with "amplicons" representing a pool
of methylated CpG DNA, from test (e.g., tumor) or reference samples. The differences in test and
reference signal intensities on screened CpG island arrays reflect methylation alterations of
corresponding sequences in the test DNA.

MethyLight™. In preferred embodiments, the MethyLight™ assay is used to determine the
methylation status of one or more CpG sequences. The MethyLight™ assay is a high-throughput
quantitative methylation assay that utilizes fluorescence-based real-time PCR (TaqMan®)
technology that requires no further manipulations after the PCR step (Eads et al., Cancer Res.
60:5021-5026, 2000; Eads et al., Cancer Res. 59:2302-2306, 1999; Eads et al., Nucleic Acids Res.
28:E32, 2000). Briefly, the MethyLight™ process begins with a mixed sample of genomic DNA
that is converted, in a sodium bisulfite reaction, to a mixed pool of methylation-dependent
sequence differences according to standard procedures (the bisulfite process converts
unmethylated cytosine residues to uracil). Fluorescence-based PCR is then performed either in an
"unbiased" (with primers that do not overlap known CpG methylation sites) PCR reaction, or in a
"biased" (with PCR primers that overlap known CpG dinucleotides) reaction. Sequence
discrimination can occur either at the level of the amplification process or at the level of the
fluorescence detection process, or both.

The MethyLight™ assay may also be used as a quantitative test for methylation patterns
in the genomic DNA sample, wherein sequence discrimination occurs at the level of probe
hybridization. In this quantitative version, the PCR reaction provides for unbiased amplification
in the presence of a fluorescent probe that overlaps a particular putative methylation site. An
unbiased control for the amount of input DNA is provided by a reaction in which neither the
primers nor the probe overlie any CpG dinucleotides. Alternatively, a qualitative test for genomic
methylation is achieved by probing of the biased PCR pool with either control oligonucleotides
that do not "cover" known methylation sites (a fluorescence-based version of the "MSP"
technique), or with oligonucleotides covering potential methylation sites.

The MethyLight™ process can be used with a "TaqMan®" probe in the amplification
process. For example, double-stranded genomic DNA is treated with sodium bisulfite and
subjected to one of two sets of PCR reactions using TaqMan® probes; e.g., with either biased
primers and TaqMan® probe, or unbiased primers and TaqMan® probe. The TaqMan® probe is
dual-labeled with fluorescent "reporter" and "quencher" molecules, and is designed to be specific
for a relatively high GC content region so that it melts out at about 10°C higher temperature in the
PCR cycle than the forward or reverse primers. This allows the TaqMan® probe to remain fully
hybridized during the PCR annealing/extension step. As the Taq polymerase enzymatically
synthesizes a new strand during PCR, it will eventually reach the annealed TaqMan® probe. The
Taq polymerase 5' to 3' endonuclease activity will then displace the TaqMan® probe by digesting
it to release the fluorescent reporter molecule for quantitative detection of its now unquenched
signal using a real-time fluorescent detection system.

Typical reagents (e.g., as might be found in a typical MethyLight™-based methylation
kit) for MethyLight™ analysis may include, but are not limited to: PCR primers for specific gene
(or methylation-altered DNA sequence or CpG island); TaqMan® probes; optimized PCR buffers
and deoxynucleotides; and Taq polymerase. A detailed description of four alternate process
applications ("A" through "D") of the MethyLight™ assay follows below. Preferably, the
quantitative MethyLight™ process application "B" is used.

MethyLight™-based detection of the methylated nucleic acid is relatively rapid and is
based on amplification-mediated displacement of specific oligonucleotide probes. In a preferred
embodiment, amplification and detection, in fact, occur simultaneously as measured by
fluorescence-based real-time quantitative PCR ("RT-PCR") using specific, dual-labeled TaqMan®
oligonucleotide probes, with no requirement for subsequent manipulation or analysis. The
displaceable probes can be specifically designed to distinguish between methylated and
unmethylated CpG sites present in the original, unmodified nucleic acid sample.

Like the technique of methylation-specific PCR ("MSP"; US Patent 5,786,146),
MethyLight™ provides for significant advantages over previous PCR-based and other methods
(e.g., Southern analyses) used for determining methylation patterns. MethyLight™ is substantially
more sensitive than Southern analysis, and facilitates the detection of a low number (percentage)
of methylated alleles in very small nucleic acid samples, as well as paraffin-embedded samples.
Moreover, in the case of genomic DNA, analysis is not limited to DNA sequences recognized by
methylation-sensitive restriction endonucleases, thus allowing for fine mapping of methylation
patterns across broader CpG-rich regions. MethyLight™ also eliminates any false-positive results
that otherwise might result from incomplete digestion by methylation-sensitive restriction
enzymes, inherent in previous PCR-based methylation methods.

MethyLight™ can be applied as a quantitative process for measuring methylation amounts, 
and is substantially more rapid than other methods. MethyLight™ does not require any post-PCR 
manipulation or processing. This not only greatly reduces the amount of labor involved in the 
analysis of bisulfite-treated DNA, but it also provides a means to avoid handling of PCR products 
that could contaminate future reactions.

One process embodiment uses MethyLight™ for the unbiased amplification of all possible
methylation states using primers that do not cover any CpG sequences in the original, unmodified
DNA sequence. To the extent that all methylation patterns are amplified equally, quantitative
information about DNA methylation patterns is then distilled from the resulting PCR pool by any
technique capable of detecting sequence differences (e.g., by fluorescence-based PCR).

For MethyLight™, one or a series of CpG-specific TaqMan® probes, each
corresponding to a particular methylation site in a given amplified DNA region, are constructed.
This series of probes is then utilized in parallel amplification reactions, using aliquots of a single,
modified DNA sample, to simultaneously determine the complete methylation pattern present in
the original unmodified sample of genomic DNA. This is accomplished in a fraction of the time
and expense required for direct sequencing of the sample of genomic DNA, and is substantially
more sensitive. Moreover, one embodiment of MethyLight™ provides for a quantitative
assessment of such a methylation pattern.

The present invention, as described herein, may be practiced using a variety of methylation
assays. For MethyLight™ embodiments, there are four process techniques and associated
diagnostic kits that use a methylation-dependent nucleic acid modifying agent (e.g., bisulfite) to
both qualitatively and quantitatively determine CpG methylation status in nucleic acid samples
(e.g., genomic DNA samples). The four processes are described herein as processes "A," "B," "C"
and "D." Overall, methylated-CpG sequence discrimination is designed to occur at the level of
amplification, probe hybridization or at both levels. For example, applications C and D utilize
"biased" primers that distinguish between modified unmethylated and methylated nucleic acid and
provide methylated-CpG sequence discrimination at the PCR amplification level. Process B uses
"unbiased" primers (that do not cover CpG methylation sites) to provide for unbiased
amplification of modified nucleic acid, but rather utilizes probes that distinguish between modified
unmethylated and methylated nucleic acid to provide for quantitative methylated-CpG sequence
discrimination at the detection level (e.g., at the fluorescent (or luminescent) probe hybridization
level only). Process A does not, in itself, provide for methylated-CpG sequence discrimination at
either the amplification or detection levels, but supports and validates the other three applications
by providing control reactions for input DNA.
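
The relationship among the four processes can be summarized in a small lookup structure. This is only an illustrative sketch of the design space described in this section; the class and field names are hypothetical and are not part of any kit or software described herein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethyLightProcess:
    primers_cover_cpg: bool  # "biased" primers overlap CpG sites
    probe_covers_cpg: bool   # probe overlaps CpG sites

    @property
    def discrimination_level(self) -> str:
        """Where methylated-CpG sequence discrimination occurs."""
        if self.primers_cover_cpg and self.probe_covers_cpg:
            return "amplification and probe hybridization"
        if self.primers_cover_cpg:
            return "amplification"
        if self.probe_covers_cpg:
            return "probe hybridization"
        return "none (input DNA control)"


# Process letters as used in the text.
PROCESSES = {
    "A": MethyLightProcess(primers_cover_cpg=False, probe_covers_cpg=False),
    "B": MethyLightProcess(primers_cover_cpg=False, probe_covers_cpg=True),
    "C": MethyLightProcess(primers_cover_cpg=True, probe_covers_cpg=False),
    "D": MethyLightProcess(primers_cover_cpg=True, probe_covers_cpg=True),
}
```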


MethyLight™ Process D. In a first MethyLight™ embodiment, the invention provides a
method for qualitatively detecting a methylated CpG-containing nucleic acid, the method
including: contacting a nucleic acid-containing sample with a modifying agent that modifies
unmethylated cytosine to produce a converted nucleic acid; amplifying the converted nucleic acid
by means of two oligonucleotide primers in the presence of a specific oligonucleotide
hybridization probe, wherein both the primers and probe distinguish between modified
unmethylated and methylated nucleic acid; and detecting the "methylated" nucleic acid based on
amplification-mediated probe displacement.

The term "modifies" as used herein means the conversion of an unmethylated cytosine to
another nucleotide by the modifying agent, said conversion distinguishing unmethylated from
methylated cytosine in the original nucleic acid sample. Preferably, the agent modifies
unmethylated cytosine to uracil. Preferably, the agent used for modifying unmethylated cytosine
is sodium bisulfite; however, other equivalent modifying agents that selectively modify
unmethylated cytosine, but not methylated cytosine, can be substituted in the method of the
invention. Sodium bisulfite readily reacts with the 5,6-double bond of cytosine, but not with
methylated cytosine, to produce a sulfonated cytosine intermediate that undergoes deamination
under alkaline conditions to produce uracil. Because Taq polymerase recognizes uracil as thymine
and 5-methylcytidine (mC) as cytidine, the sequential combination of sodium bisulfite treatment
and PCR amplification results in the ultimate conversion of unmethylated cytosine residues to
thymine (C -> U -> T) and methylated cytosine residues ("mC") to cytosine (mC -> mC -> C).
Thus, sodium bisulfite treatment of genomic DNA creates methylation-dependent sequence
differences by converting unmethylated cytosines to uracil, and upon PCR the resultant product
contains cytosine only at positions where methylated cytosine occurs in the unmodified nucleic
acid.
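
The two-step conversion described above (C -> U -> T for unmethylated cytosine, mC -> mC -> C for methylated cytosine) can be modeled in a few lines. This is a minimal sketch for a single strand, assuming complete conversion; the function names and the use of 0-based positions to mark methylated cytosines are illustrative assumptions.

```python
def bisulfite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Model bisulfite treatment: unmethylated C becomes U, methylated C is unchanged.

    methylated_positions holds 0-based indices of methylated cytosines.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("U")
        else:
            out.append(base)
    return "".join(out)


def pcr_readout(converted: str) -> str:
    """Model amplification: uracil is read as thymine, so U appears as T in the product."""
    return converted.replace("U", "T")
```

For example, with the hypothetical strand ACGTCGC methylated only at index 1, the converted strand is ACGTUGU and the amplified product reads ACGTTGT, so cytosine remains only at the methylated position.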

Oligonucleotide "primers," as used herein, means linear, single-stranded, oligomeric
deoxyribonucleic or ribonucleic acid molecules capable of sequence-specific hybridization
(annealing) with complementary strands of modified or unmodified nucleic acid. As used herein,
the specific primers are preferably DNA. The primers of the invention embrace oligonucleotides
of appropriate sequence and sufficient length so as to provide for specific and efficient initiation of
polymerization (primer extension) during the amplification process. As used in the inventive
processes, oligonucleotide primers typically contain 12-30 nucleotides or more, although they may
contain fewer nucleotides. Preferably, the primers contain from 18-30 nucleotides. The exact
length will depend on multiple factors including temperature (during amplification), buffer, and
nucleotide composition. Preferably, primers are single-stranded although double-stranded primers
may be used if the strands are first separated. Primers may be prepared using any suitable method,
such as conventional phosphotriester and phosphodiester methods or automated embodiments
which are commonly known in the art.

As used in the inventive embodiments herein, the specific primers are preferably designed 
to be substantially complementary to each strand of the genomic locus of interest. Typically, one 
primer is complementary to the negative (-) strand of the locus (the "lower" strand of a 
horizontally situated double-stranded DNA molecule) and the other is complementary to the 
positive (+) strand (the "upper" strand). As used in the embodiment of Application D, the primers are
preferably designed to overlap potential sites of DNA methylation (CpG dinucleotides) and
specifically distinguish modified unmethylated from methylated DNA. Preferably, this sequence 
discrimination is based upon the differential annealing temperatures of perfectly matched, versus 
mismatched oligonucleotides. In the embodiment of Application D, primers are typically 
designed to overlap from one to several CpG sequences. Preferably, they are designed to overlap 
from 1 to 5 CpG sequences, and most preferably from 1 to 4 CpG sequences. By contrast, in a 
quantitative embodiment of the invention employed in the Examples of the present invention, the 
primers do not overlap any CpG sequences. 

In the case of fully "unmethylated" (complementary to modified unmethylated nucleic acid 
strands) primer sets, the anti-sense primers contain adenosine residues ("A"s) in place of 
guanosine residues ("G"s) in the corresponding (-) strand sequence. These substituted As in the 
anti-sense primer will be complementary to the uracil and thymidine residues ("Us" and "Ts") in 
the corresponding (+) strand region resulting from bisulfite modification of unmethylated C 
residues ("Cs") and subsequent amplification. The sense primers, in this case, are preferably 
designed to be complementary to anti-sense primer extension products, and contain Ts in place of 
unmethylated Cs in the corresponding (+) strand sequence. These substituted Ts in the sense 
primer will be complementary to the As, incorporated in the anti-sense primer extension products 
at positions complementary to modified Cs (Us) in the original (+) strand. 

In the case of fully-methylated primers (complementary to methylated CpG-containing
nucleic acid strands), the anti-sense primers will not contain As in place of Gs in the
corresponding (-) strand sequence that are complementary to methylated Cs (i.e., mCpG
sequences) in the original (+) strand. Similarly, the sense primers in this case will not contain Ts
in place of methylated Cs in the corresponding (+) strand mCpG sequences. However, Cs that are
not in CpG sequences in regions covered by the fully-methylated primers, and are not methylated,
will be represented in the fully-methylated primer set as described above for unmethylated
primers.
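
The base substitutions described for fully unmethylated and fully methylated primer sets can be sketched as follows. This is an illustrative sketch only, assuming the primer spans the entire given (+) strand region and that conversion is complete; the function names are hypothetical, and the sketch does not choose primer positions or lengths.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def converted_top_strand(region: str, fully_methylated: bool) -> str:
    """(+) strand region as it reads after bisulfite treatment and amplification.

    Cs outside CpG become T. Cs within CpG become T only for the
    unmethylated case and remain C for the fully methylated case.
    """
    region = region.upper()
    out = []
    for i, base in enumerate(region):
        in_cpg = base == "C" and region[i + 1:i + 2] == "G"
        if base == "C" and not (fully_methylated and in_cpg):
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def sense_primer(region: str, fully_methylated: bool) -> str:
    """Sense primer: Ts in place of converted Cs of the (+) strand."""
    return converted_top_strand(region, fully_methylated)


def antisense_primer(region: str, fully_methylated: bool) -> str:
    """Anti-sense primer: complementary to the converted (+) strand, written 5' to 3'."""
    return reverse_complement(converted_top_strand(region, fully_methylated))
```

For the hypothetical region ACGTCCGA, the unmethylated sense primer reads ATGTTTGA and the fully methylated sense primer reads ACGTTCGA; the corresponding anti-sense primers carry A wherever the (+) strand C was converted.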

Preferably, as employed in the embodiment of process D, the amplification process
provides for amplifying bisulfite converted nucleic acid by means of two oligonucleotide primers
in the presence of a specific oligonucleotide hybridization probe. Both the primers and probe
distinguish between modified unmethylated and methylated nucleic acid. Moreover, detecting the
"methylated" nucleic acid is based upon amplification-mediated probe fluorescence. In one
embodiment, the fluorescence is generated by probe degradation by 5' to 3' exonuclease activity
of the polymerase enzyme. In another embodiment, the fluorescence is generated by fluorescence
energy transfer effects between two adjacent hybridizing probes (Lightcycler® technology) or
between a hybridizing probe and a primer. In another embodiment, the fluorescence is generated
by the primer itself (Sunrise® technology). Preferably, the amplification process is an enzymatic
chain reaction that uses the oligonucleotide primers to produce exponential quantities of
amplification product, from a target locus, relative to the number of reaction steps involved.

As described above, one member of a primer set is complementary to the (-) strand, while
the other is complementary to the (+) strand. The primers are chosen to bracket the area of interest
to be amplified; that is, the "amplicon." Hybridization of the primers to denatured target nucleic
acid, followed by primer extension with a DNA polymerase and nucleotides, results in synthesis of
new nucleic acid strands corresponding to the amplicon. Preferably, the DNA polymerase is Taq
polymerase, as commonly used in the art, although equivalent polymerases with a 5' to 3'
nuclease activity can be substituted. Because the new amplicon sequences are also templates for
the primers and polymerase, repeated cycles of denaturing, primer annealing, and extension result
in exponential production of the amplicon. The product of the chain reaction is a discrete nucleic
acid duplex, corresponding to the amplicon sequence, with termini defined by the ends of the
specific primers employed. Preferably the amplification method used is that of PCR (Mullis et al.,
Cold Spring Harb. Symp. Quant. Biol. 51:263-273; Gibbs, Anal. Chem. 62:1202-1214, 1990), or
more preferably, automated embodiments thereof which are commonly known in the art.

Preferably, methylation-dependent sequence differences are detected by methods based on
fluorescence-based quantitative PCR (real-time quantitative PCR, Heid et al., Genome Res.
6:986-994, 1996; Gibson et al., Genome Res. 6:995-1001, 1996) (e.g., "TaqMan®," "Lightcycler®," and
"Sunrise®" technologies). For the TaqMan® and Lightcycler® technologies, the sequence
discrimination can occur at either or both of two steps: (1) the amplification step, or (2) the
fluorescence detection step. In the case of the "Sunrise®" technology, the amplification and
fluorescence detection steps are the same. In the case of the FRET hybridization probes format on
the Lightcycler®, either or both of the FRET oligonucleotides can be used to distinguish the sequence
difference. Most preferably the amplification process, as employed in all inventive embodiments
herein, is that of fluorescence-based Real Time Quantitative PCR (Heid et al., Genome Res.
6:986-994, 1996) employing a dual-labeled fluorescent oligonucleotide probe (TaqMan® PCR, using an
ABI Prism 7700 Sequence Detection System, Perkin Elmer Applied Biosystems, Foster City,
California).

The "TaqMan®" PCR reaction uses a pair of amplification primers along with a
nonextendible interrogating oligonucleotide, called a TaqMan® probe, that is designed to
hybridize to a GC-rich sequence located between the forward and reverse (i.e., sense and
anti-sense) primers. The TaqMan® probe further comprises a fluorescent "reporter moiety" and a
"quencher moiety" covalently bound to linker moieties (e.g., phosphoramidites) attached to
nucleotides of the TaqMan® oligonucleotide. Examples of suitable reporter and quencher
molecules are: the 5' fluorescent reporter dyes 6FAM ("FAM"; 2,7 dimethoxy-4,5-dichloro-6-
carboxy-fluorescein), and TET (6-carboxy-4,7,2',7'-tetrachloro fluorescein); and the 3' quencher
dye TAMRA (6-carboxytetramethylrhodamine) (Livak et al., PCR Methods Appl. 4:357-362,
1995; Gibson et al., Genome Res. 6:995-1001, 1996; and Heid et al., Genome Res. 6:986-994,
1996).

One process for designing appropriate TaqMan® probes involves utilizing a software
tool, such as "Primer Express," that can determine the variables of CpG island location
within GC-rich sequences to provide for at least a 10°C melting temperature difference (relative to
the primer melting temperatures) due to either specific sequence (tighter bonding of GC, relative
to AT, base pairs), or to primer length.
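
The 10°C melting temperature margin between probe and primers can be illustrated with a rough check. This sketch is not the method used by Primer Express; it assumes the simple Wallace rule, Tm = 2 x (A+T) + 4 x (G+C) in °C, which is only a coarse approximation suited to short oligonucleotides. The function names are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Approximate melting temperature in degrees C by the Wallace rule (assumed here)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2.0 * at + 4.0 * gc


def probe_tm_margin_ok(probe: str, primers, margin_c: float = 10.0) -> bool:
    """True if the probe Tm exceeds every primer Tm by at least margin_c degrees."""
    probe_tm = wallace_tm(probe)
    return all(probe_tm - wallace_tm(p) >= margin_c for p in primers)
```

The rule reflects the point made in the text: GC base pairs contribute more to duplex stability than AT base pairs, and longer oligonucleotides melt at higher temperatures.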

The TaqMan® probe may or may not cover known CpG methylation sites, depending on
the particular inventive process used. Preferably, in the embodiment of process D, the TaqMan®
probe is designed to distinguish between modified unmethylated and methylated nucleic acid by
overlapping from 1 to 5 CpG sequences. As described above for the fully unmethylated and fully
methylated primer sets, TaqMan® probes may be designed to be complementary to either
unmodified nucleic acid, or, by appropriate base substitutions, to bisulfite-modified sequences that
were either fully unmethylated or fully methylated in the original, unmodified nucleic acid sample.

Each oligonucleotide primer or probe in the TaqMan® PCR reaction can span anywhere
from zero to many different CpG dinucleotides that each can result in two different sequence
variations following bisulfite treatment (mCpG or UpG). For instance, if an oligonucleotide spans
3 CpG dinucleotides, then the number of possible sequence variants arising in the genomic DNA
is 2^3 = 8 different sequences. If the forward and reverse primer each span 3 CpGs and the probe
oligonucleotide (or both oligonucleotides together in the case of the FRET format) spans another
3, then the total number of sequence permutations becomes 8 x 8 x 8 = 512. In theory, one could
design separate PCR reactions to quantitatively analyze the relative amounts of each of these 512
sequence variants. In practice, a substantial amount of qualitative methylation information can be
derived from the analysis of a much smaller number of sequence variants. Thus, in its most
simple form, the inventive process can be performed by designing reactions for the fully
methylated and the fully unmethylated variants that represent the most extreme sequence variants
in a hypothetical example. The ratio between these two reactions, or alternatively the ratio
between the methylated reaction and a control reaction (process A), would provide a measure for
the level of DNA methylation at this locus.
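
The counting argument above can be checked by direct enumeration. This is a small sketch with hypothetical function names; each CpG spanned by an oligonucleotide is treated as independently either mCpG or UpG after bisulfite treatment.

```python
from itertools import product


def bisulfite_variants(n_cpg: int):
    """All methylation patterns for an oligonucleotide spanning n_cpg CpG dinucleotides."""
    return list(product(("mCpG", "UpG"), repeat=n_cpg))


def total_permutations(cpgs_per_oligo) -> int:
    """Number of combined sequence variants across several oligonucleotides."""
    total = 1
    for n in cpgs_per_oligo:
        total *= 2 ** n
    return total
```

With three CpGs in each of the forward primer, reverse primer and probe, `len(bisulfite_variants(3))` gives 8 and `total_permutations([3, 3, 3])` gives 512, of which the fully methylated and fully unmethylated patterns are the two extremes.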

Detection of methylation in the MethyLight™ embodiment of process D, as in other
MethyLight™ embodiments herein, is based on amplification-mediated displacement of the probe.
In theory, the process of probe displacement might be designed to leave the probe intact, or to
result in probe digestion. Preferably, as used herein, displacement of the probe occurs by
digestion of the probe during amplification. During the extension phase of the PCR cycle, the
fluorescent hybridization probe is cleaved by the 5' to 3' nucleolytic activity of the DNA
polymerase. On cleavage of the probe, the reporter moiety emission is no longer transferred
efficiently to the quenching moiety, resulting in an increase of the reporter moiety
fluorescent-emission spectrum at 518 nm. The fluorescent intensity of the quenching moiety (e.g., TAMRA)
changes very little over the course of the PCR amplification. Several factors may influence the
efficiency of TaqMan® PCR reactions including: magnesium and salt concentrations; reaction
conditions (time and temperature); primer sequences; and PCR target size (i.e., amplicon size) and
composition. Optimization of these factors to produce the optimum fluorescence intensity for a
given genomic locus is obvious to one skilled in the art of PCR, and preferred conditions are
further illustrated in the "Examples" herein. The amplicon may range in size from 50 to 8,000
base pairs, or larger, but may be smaller. Typically, the amplicon is from 100 to 1000 base pairs,
and preferably is from 100 to 500 base pairs. Preferably, the reactions are monitored in real time
by performing PCR amplification using 96-well optical trays and caps, and using a sequence
detector (ABI Prism) to allow measurement of the fluorescent spectra of all 96 wells of the
thermal cycler continuously during the PCR amplification. Preferably, process D is run in
combination with the process A to provide controls for the amount of input nucleic acid, and to
normalize data from tray to tray.

MethyLight™ Process C. The MethyLight™ process can be modified to avoid sequence
discrimination at the PCR product detection level. Thus, in an additional qualitative process
embodiment, just the primers are designed to cover CpG dinucleotides, and sequence
discrimination occurs solely at the level of amplification. Preferably, the probe used in this
embodiment is still a TaqMan® probe, but is designed so as not to overlap any CpG sequences
present in the original, unmodified nucleic acid. The embodiment of process C represents a
high-throughput, fluorescence-based real-time version of MSP technology, wherein a substantial
improvement has been attained by reducing the time required for detection of methylated CpG
sequences. Preferably, the reactions are monitored in real time by performing PCR amplification
using 96-well optical trays and caps, and using a sequence detector (ABI Prism) to allow
measurement of the fluorescent spectra of all 96 wells of the thermal cycler continuously during
the PCR amplification. Preferably, process C is run in combination with process A (below) to
provide controls for the amount of input nucleic acid, and to normalize data from tray to tray.

MethyLight™ Process B. In preferred embodiments of the present invention, the
MethyLight™ process can also be modified to avoid sequence discrimination at the PCR
amplification level. In a quantitative process B embodiment, just the probe is designed to cover
CpG dinucleotides, and sequence discrimination occurs solely at the level of probe hybridization.
Preferably, TaqMan® probes are used. In this version, sequence variants resulting from the
bisulfite conversion step are amplified with equal efficiency, as long as there is no inherent
amplification bias (Warnecke et al., Nucleic Acids Res. 25:4422-4426, 1997). Design of separate
probes for each of the different sequence variants associated with a particular methylation pattern
(e.g., 2^3 = 8 probes in the case of 3 CpGs) would allow a quantitative determination of the relative
prevalence of each sequence permutation in the mixed pool of PCR products. Preferably, the
reactions are monitored in real time by performing PCR amplification using 96-well optical trays
and caps, and using a sequence detector (ABI Prism) to allow measurement of the fluorescent
spectra of all 96 wells of the thermal cycler continuously during the PCR amplification.
Preferably, process B is run in combination with process A (below) to provide controls for the
amount of input nucleic acid, and to normalize data from tray to tray.

MethyLight™ Process A. MethyLight™ process A does not, in itself, provide for
methylated-CpG sequence discrimination at either the amplification or detection levels, but
supports and validates the other three process applications by providing control reactions for the
amount of input DNA, and to normalize data from tray to tray. Thus, if neither the primers nor
the probe overlie any CpG dinucleotides, then the reaction represents unbiased amplification, and
measurement of amplification using fluorescent-based quantitative real-time PCR serves as a
control for the amount of input DNA. Preferably, process A not only lacks CpG dinucleotides in
the primers and probe(s), but also does not contain any CpGs within the amplicon at all, to avoid
any differential effects of the bisulfite treatment on the amplification process. Preferably, the
amplicon for process A is a region of DNA that is not frequently subject to copy number
alterations, such as gene amplification or deletion.

Results obtained with the qualitative MethyLight™ version (process embodiment "B" of
the technology) are described in the Examples below. Dozens of human tumor samples have been
analyzed using this technology with excellent results.

Cancer Diagnostic and Prognostic Assays and Kits 

Typically, diagnostic and/or prognostic assays of the present invention involve obtaining a
tissue sample from a test tissue, performing a methylation assay on DNA derived from the tissue
sample to determine the associated methylation state, and making a diagnosis or prognosis based
thereon.

In preferred embodiments, diagnostic and prognostic cancer assays are based on
determination of the methylation state of one or more of the disclosed 20 gene sequences (APC,
ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1,
TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2, TYMS and MTHFR, or methylation-altered DNA
sequence embodiments thereof), as defined herein by the oligomeric primers and probes
corresponding to SEQ ID NOS:1-60, 64 and 65 (see TABLE II, below). SEQ ID NOS:61-63
correspond to the ACTB "control" gene region used in the present analysis (see EXAMPLE 1,
below).

Additionally, other primers or probes corresponding to other sequence regions of the CpG
islands associated with the APC, ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1, HIC1,
MGMT, MLH1, MYOD1, RB1, TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2 and TYMS sequence
regions used herein may be used, based on the fact that the methylation state of a portion of a
given CpG island is generally representative of the island as a whole.

Accordingly, the reagents required to perform one or more art-recognized methylation 
assays (including those described above) are combined with such primers and/or probes, or 
portions thereof, to determine the methylation state of CpG-containing nucleic acids. 

For example, the MethyLight™, Ms-SNuPE, MCA, COBRA, and MSP methylation assays
could be used alone or in combination, along with primers or probes comprising the sequences of
SEQ ID NOS:1-65, or portions thereof, to determine the methylation state of a CpG dinucleotide
within one or more of the 20 gene sequence regions corresponding to APC, ARF, CALCA, CDH1,
CDKN2A, CDKN2B, ESR1, GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2, THBS1,
TIMP3, CTNNB1, PTGS2, TYMS or MTHFR, or, in the case of 19 of these 20 sequence regions
(i.e., for all but MTHFR), to other CpG island sequences associated with these sequences, where
such other CpG island sequences associated with these 19 gene sequences are those contiguous
sequences of genomic DNA that encompass at least one nucleotide of one of these 19 gene
sequence regions, and satisfy the criteria of having both a frequency of CpG dinucleotides
corresponding to an Observed/Expected Ratio >0.6, and a GC Content >0.5.
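
The two CpG island criteria stated above can be evaluated for any candidate sequence. The Python sketch below is illustrative only. It assumes the commonly used formula Observed/Expected = (number of CpGs x sequence length) / (number of Cs x number of Gs); the definition given under "Definitions" herein governs if it differs. The function names are hypothetical.

```python
def cpg_island_metrics(seq):
    """Return (gc_content, cpg_observed_expected) for a DNA sequence.

    Assumes Obs/Exp = (#CpG * N) / (#C * #G), where N is the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so a plain count is exact
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def meets_island_criteria(seq, min_obs_exp=0.6, min_gc=0.5):
    """Apply the two thresholds stated in the text (both strict inequalities)."""
    gc_content, obs_exp = cpg_island_metrics(seq)
    return obs_exp > min_obs_exp and gc_content > min_gc


print(meets_island_criteria("CGCGCGCG"))  # True: GC content 1.0, Obs/Exp 2.0
print(meets_island_criteria("CCCCGGGG"))  # False: GC-rich, but Obs/Exp is only 0.5
print(meets_island_criteria("ATATATAT"))  # False: no C or G at all
```

The second toy sequence shows why both criteria are needed: a GC-rich region can still be depleted of CpG dinucleotides.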

EXAMPLE 1 

CpG Island Hypermethylation Increased with the Progression of EAC 

This Example shows the results of an analysis of the methylation status of a panel of CpG
islands associated with 19 different genes selected for their known involvement in carcinogenesis
or because they have been shown to be methylated in other tumors (see Table 1, and under
"Definitions," above), and of one non-CpG island sequence (MTHFR control sequence), for a total
of 20 gene loci.

Quantitative methylation data of the 20 genes from a screen of 84 tissue specimens from
31 patients with different stages of Barrett's esophagus and/or associated adenocarcinoma showed
a general increase in the frequency and in the quantitative level of CpG island hypermethylation at
progressively advanced stages of disease. Accordingly, genes were grouped into distinct classes
by their methylation behavior, based on both frequency and level of hypermethylation in various
tissues (Figure 1).

Materials and Methods 

Sample Collection and histopathologic examination. Multiple tissue samples (normal
esophagus (NE), normal stomach (S), intestinal metaplasia (IM), dysplasia (DYS) and/or
adenocarcinoma (T)) from a total of 51 patients (range 39-86 years of age) with either
adenocarcinoma or IM as the most advanced stage of disease were collected.

The initial set of samples analyzed included biopsies from 31 patients which were
collected fresh and subdivided such that a part of each specimen was immediately frozen in liquid
nitrogen and also embedded in paraffin for histopathologic examination by a pathologist (K.W.).
Normal esophageal tissue was collected from every patient 10 cm or more away from the diseased
areas. Frozen section examination of the frozen tissues was performed if the diagnosis was
uncertain. The site of origin of the cancers was classified as esophageal if the epicenter of the
tumor was above the anatomic gastroesophageal junction, with the junction defined as the
proximal margin of the gastric rugal folds. TNM staging was used to classify the stage of each
adenocarcinoma.

A second set of samples was obtained for a follow-up study of 20 cases. Two groups of
IM samples were collected: patients that had only IM as the most advanced stage of disease (8
patients), and patients that had IM with associated dysplasia/adenocarcinoma located in another
region of the esophagus (12 patients). H&E slides (5-micron sections) for each sample were
prepared and examined by a pathologist (K.W.) to verify and localize the IM tissue. Cases that
showed any signs of dysplasia or adenocarcinoma in the paraffin block used for analysis were
excluded from this follow-up study. The IM tissues were carefully microdissected away from
other cell types from a 30-micron section adjacent to the 5-micron H&E section. All specimens
were classified according to the highest grade histopathologic lesion present in that sample.
Approval for this study was obtained from the Institutional Review Board of the University of
Southern California Keck School of Medicine.

Nucleic Acid Isolation. Genomic DNA was isolated from the frozen tissue biopsies by a
simplified proteinase K digestion method (Laird et al., Nucleic Acids Res. 19:4293, 1991). The
DNA from the paraffin tissues was extracted in lysis buffer (100 mM Tris-HCl, pH 8; 10 mM
EDTA; and 1 mg/ml Proteinase K) overnight at 50°C (Shibata et al., Am. J. Pathol. 141:539-543,
1992).

Sodium Bisulfite Conversion. Sodium bisulfite conversion of genomic DNA was
performed as previously described (Olek et al., Nucleic Acids Res. 24:5064-5066, 1996). The
beads were incubated for 14 hours at 50°C to ensure complete conversion. Sodium bisulfite
treatment converts unmethylated cytosines to uracil, while leaving methylated cytosine residues
intact (Frommer et al., Proc. Natl. Acad. Sci. USA 89:1827-31, 1992).
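
The sequence consequence of bisulfite treatment, on which primer and probe design depends, can be modeled in a few lines. The following Python sketch is illustrative only: it simulates the conversion on a single strand, writes converted cytosines as T (uracil is read as thymine after PCR amplification), and uses a hypothetical function name and toy template.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Return the strand as read after bisulfite treatment and PCR.

    Unmethylated C is converted (C -> U, amplified as T); a C whose
    0-based index is listed in methylated_positions is left intact.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


template = "ACGTCCGA"                        # toy sequence; CpG cytosines at indices 1 and 5
print(bisulfite_convert(template))           # ATGTTTGA (fully unmethylated)
print(bisulfite_convert(template, {1, 5}))   # ACGTTCGA (both CpGs methylated)
```

The two outputs differ only at the CpG cytosines, which is the sequence difference that methylation-specific primers and probes are designed to discriminate.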

MethyLight™ Analysis. After sodium bisulfite conversion, the methylation analysis was
performed by the fluorescence-based, real-time PCR assay MethyLight™, as described herein, and
as previously described (Eads et al., Cancer Res. 60:5021-5026, 2000; Eads et al., Cancer Res.
59:2302-2306, 1999; Eads et al., Nucleic Acids Res. 28:E32, 2000). Two sets of primers and
probes, designed specifically for bisulfite converted DNA, were used: a methylated set for the
gene of interest and a reference set, beta-actin (ACTB), to normalize for input DNA. Specificity of
the reactions for methylated DNA was confirmed separately using human sperm DNA (with very
low levels of CpG island methylation) and SssI (New England Biolabs)-treated sperm DNA
(heavily methylated) as previously described (Eads et al., Cancer Res. 60:5021-5026, 2000).
The percentage of fully methylated molecules at a specific locus was calculated by
dividing the GENE/ACTB ratio of a sample by the GENE/ACTB ratio of SssI-treated sperm DNA
and multiplying by 100. The abbreviation PMR (Percent of Methylated Reference) is used to
indicate this measurement. The methylation analysis on the paraffin microdissected samples was
performed following bisulfite treatment as described above by an investigator blind to the
associated dysplasia status of the samples.
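
The PMR calculation described above reduces to a ratio of two ratios. The Python sketch below is illustrative only: the function and parameter names are hypothetical, the input quantities are assumed to be the relative template amounts reported by the real-time PCR instrument, and the example numbers are invented for demonstration.

```python
def percent_methylated_reference(sample_gene, sample_actb, ref_gene, ref_actb):
    """PMR = 100 * (GENE/ACTB of sample) / (GENE/ACTB of SssI-treated sperm DNA)."""
    if sample_actb <= 0 or ref_gene <= 0 or ref_actb <= 0:
        raise ValueError("control and reference quantities must be positive")
    sample_ratio = sample_gene / sample_actb
    reference_ratio = ref_gene / ref_actb
    return 100.0 * sample_ratio / reference_ratio


# Invented quantities: the sample yields 4% of the reference GENE/ACTB ratio.
print(percent_methylated_reference(2.0, 50.0, 40.0, 40.0))
```

The positivity check reflects the requirement that the ACTB control reaction must amplify for a PMR value to be meaningful.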

TABLE II lists the MethyLight™ primer and probe sequences (SEQ ID NOs:1-65), based
on GenBank sequence data (except for SEQ ID NOs:64 and 65, see below), used in the present
methylation analysis. Three oligos were used in every reaction: two locus-specific PCR primers
flanking an oligonucleotide probe with a 5' fluorescent reporter dye (6FAM) and a 3' quencher
dye (TAMRA) (Livak et al., PCR Methods Appl. 4:357-362, 1995). The GenBank accession
number for each sequence is listed with the corresponding PCR amplicon location within that
sequence. The %GC content, CpG observed/expected value and CpG:GpC ratio of 200 base pairs
encompassing the MethyLight™ amplicon are indicated for each gene. The reaction type is
designated "M" for methylation reaction and "C" for control reaction. The bisulfite treated DNA
strand (top ("T") or bottom ("B")) and amplicon orientation (parallel ("P") or antiparallel ("A")) are
also indicated. All primer and probe sequences are listed in the 5' to 3' direction. The numbers in
brackets after each primer or probe sequence correspond to the associated SEQ ID NOs. The
single asterisk (*) notes that there are two bases in our CDKN2A primers that differ from this
GenBank sequence, since a preliminary high-throughput GenBank entry was the only available
sequence at the time of applicants' primer design. The correct primers should be the following:
forward, TGGAGTTTTCGGTTGATTGGTT (SEQ ID NO:64) and reverse,
AACAACGCCCGCACCTCCT (SEQ ID NO:65). The bases differing from the GenBank
sequences are underlined. The double asterisk (**) indicates that the start site is not well defined.

Forward Primer Sequence (5'-3')      Reverse Primer Sequence (5'-3')   Probe Sequence (5'-3')

ACGGGCGTTTTCGGTAGTT [22]             CCGAACCTCCAAAATCTCGA [23]         6FAM-CGACTCTAAACCCTACGCACGCGAAA-TAMRA [24]
AATTTTAGGTTAGAGGGTTATCGCGT [25]      TCCCCAAAACGAAACTAACGAC [26]       6FAM-CGCCCACCCGACCTCGCAT-TAMRA [27]
AGGAAGGAGAGAGTGCGTCG [28]            CGAATAATCCACCGTTAACCG [29]        6FAM-TTAACGACACTCTTCCCTTCTTTCCCACG-TAMRA [30]
GTCGGCGTCGTGATTTAGTATTG [31]         AAACTACGACGACGAAACTCCAA [32]      6FAM-AAACCTCGCGACCTCCGAACCTTATAAAA-TAMRA [33]
CGTTATATATCGTTCGTAGTATTCGTGTTT [35]  CTATCGCCGCCTCATCGT [34]           6FAM-CGCGACGTCAAACGCCACTACG-TAMRA [36]
CGGAAGCGTTCGGGTAAAG [37]             AATTCCACCGCCCCAAAC [38]           6FAM-TTTCCGCCAAATATCTTTTCTTCTTCGCA-TAMRA [39]
CGACGCACCAACCTACCG [40]              GTTTTGAGTTGGTTTTACGTTCGTT [41]    6FAM-ACGCCGCGCTCACCTCCCT-TAMRA [42]
GGAAAGGCGCGTCGAGT [43]               TCCCCTATCCCAAACCCG [44]           6FAM-CGCGCGTTTCCCGAACCG-TAMRA [45]
TTAGTTCGCGTATCGATTAGCG [46]          ACTAAACGCCGCGTCCAA [47]           6FAM-TCACGTCCGCGAAACTCCCGA-TAMRA [48]
GCGCGGAGCGTAGTTAGG [49]              CAAACCCCGCTACTCGTCAT [50]         6FAM-CACGAACGACGCCTTCCCGAA-TAMRA [51]
CGGCGTTAGGAAGGACGAT [52]             TCTCAAACTATAACGCGCCTACAT [53]     6FAM-CCGAATACCGACAAAATACCGATACCCGT-TAMRA [54]
GTTAGGCGGTTAGGGCGTC [58]             CCGAACGCC1CCATCGTAT [59]          6FAM-CAACATCGTCTACCCAACACACTCTCCTACG-TAMRA [60]
TGGTAGTGAGAGTTTTAAAGATAGTTCGA [55]   CGCCTCATCTTCTCCCGA [56]           6FAM-TCTCATACCGCTCAAAATCCAAACCCG-TAMRA [57]
TGGTGATGGAGGAGGTTTAGTAAGT [61]       AACCAATAAAACCTACTCCTCCCTTAA [62]  6FAM-ACCACCACCCAACACACAATAACAAACACA-TAMRA [63]

Statistics. The PMR values obtained by MethyLight™ (see above) were "dichotomized"
at 4 PMR for statistical purposes as described previously (Eads et al., Cancer Res. 60:5021-5026,
2000). Dichotomization facilitates graphical representation, and moderates the quantitative impact
of gene loci with different levels of hypermethylation, resulting in a more reliable cross-gene
comparison of hypermethylation frequencies. Specifically, dichotomization equalizes the
quantitative impact of methylated genes within each class (see "Epigenetic gene classes," below),
simplifying cross-gene comparisons of methylation frequencies.

A dichotomization point of 4 PMR was selected because it gave the best discrimination
between normal and malignant tissues, across the board for all CpG islands (Eads et al., Cancer
Res. 60:5021-5026, 2000). However, the precise dichotomization point does not significantly
affect the statistics or alter the conclusions, and other dichotomization points are within the scope
of the present invention (see below).

Accordingly, samples containing 4 PMR or higher were designated as methylated and
given a value of 1, while samples containing less than 4 PMR were designated as unmethylated
and given a value of 0. The cumulative value of genes methylated in each class (see "Epigenetic
gene classes" A-G, herein below), or for all 19 genes, was then used as a continuous variable in a
Fisher's Protected Least Significant Difference test, adapted for use with unequal sample sizes
(SAS Statview software), to obtain p-values. The different parameters such as tissue type,
presence of associated dysplasia, tumor stage, etc., were used as the nominal variables. The IM
samples in the above-mentioned "follow-up" study of hypermethylation in IM, and the presence of
associated dysplasia and/or carcinoma, were further dichotomized at 1 or fewer, versus two or
more Class A genes methylated. A Fisher's exact test was then used to determine statistical
significance.
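
The dichotomization and class scoring described above can be expressed compactly. The Python sketch below is illustrative only. The Class A and Class B gene memberships are those given under "Epigenetic Gene Classes" herein; the function names and the PMR values in the example sample are hypothetical.

```python
EPIGENETIC_CLASSES = {
    "A": ("CDKN2A", "ESR1", "MYOD1"),
    "B": ("CALCA", "MGMT", "TIMP3"),
}


def dichotomize(pmr_value, cutoff=4.0):
    """Return 1 (methylated) at or above the cutoff, otherwise 0 (unmethylated)."""
    return 1 if pmr_value >= cutoff else 0


def class_score(sample_pmr, genes, cutoff=4.0):
    """Cumulative number of genes in a class scored as methylated in one sample."""
    return sum(dichotomize(sample_pmr[gene], cutoff) for gene in genes)


# Hypothetical PMR values for a single tissue sample.
sample = {"CDKN2A": 0.0, "ESR1": 35.2, "MYOD1": 4.0,
          "CALCA": 3.9, "MGMT": 60.1, "TIMP3": 88.0}

class_a = class_score(sample, EPIGENETIC_CLASSES["A"])
print(class_a)                                                      # 2
print(class_score(sample, EPIGENETIC_CLASSES["B"]))                 # 2
print(class_score(sample, EPIGENETIC_CLASSES["A"], cutoff=10.0))    # 1
print("two or more" if class_a >= 2 else "one or fewer")            # follow-up grouping of IM samples
```

Note that a highly methylated gene (88 PMR) and a marginally methylated gene (4 PMR) contribute equally to the class score, which is the equalizing effect the dichotomization is intended to achieve.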

Results

CpG Island Hypermethylation and the Progression of EAC. The methylation status of a
panel of CpG islands associated with 19 different genes and of one non-CpG island sequence, for a
total of 20 gene loci, was analyzed by the quantitative, high-throughput MethyLight™ assay (Eads
et al., Cancer Res. 59:2302-2306, 1999; Eads et al., Nucleic Acids Res. 28:E32, 2000). The
efficiencies of the methylation reactions were controlled for in each analysis by including
unmethylated control DNA and methylated control DNA (Eads et al., Cancer Res. 60:5021-5026,
2000). The 20 genes were selected for their known involvement in carcinogenesis or because they
have been shown to be methylated in other tumors (see Table 1, and under "Definitions," above).
We included a region located in the MTHFR gene as a "non-CpG island" control for a single copy
sequence that does not satisfy the criteria (see "Definitions," above) of a CpG island. CpG
dinucleotides outside of an island are presumably normally methylated, unlike CpG dinucleotides
within CpG islands.

Figure 1 illustrates the quantitative methylation data of the 20 genes from our screen of 84
tissue specimens from 31 patients with different stages of Barrett's esophagus and/or associated
adenocarcinoma. Methylation analysis was performed using the MethyLight™ assay (Eads et al.,
Cancer Res. 59:2302-2306, 1999; Eads et al., Nucleic Acids Res. 28:E32, 2000). The percentage
of fully methylated molecules at a specific locus (PMR = Percent of Methylated Reference) was
calculated by dividing the GENE/ACTB ratio of a sample by the GENE/ACTB ratio of SssI-treated
sperm DNA and multiplying by 100. The resulting percentages were then dichotomized at 4%
PMR to facilitate graphical representation and to reveal tissue-specific patterns. The various
squares, each having one of four possible shading intensity levels (see bottom axis of Figure 1),
designate samples with less than 4 PMR, 4-20 PMR, 21-50 PMR and more than 51 PMR, where
progressively increasing shading intensity levels correspond to progressively higher PMR values.
The tissue types are shown on the left. The TNM tumor staging is designated by "1", "2", "3" and
"4". The occurrence of distally located dysplasia and/or adenocarcinoma in the patient is
indicated at the right of the figure by "YES" if present and "NO" if absent. "N" indicates an
analysis for which the control gene ACTB did not reach sufficient levels to allow the detection of a
minimal value of 1 PMR for that methylation reaction in that particular sample.

There was a general increase in the frequency and in the quantitative level of CpG island
hypermethylation at progressively advanced stages of disease. However, the propensity for
aberrant methylation of the genes was not uniform. Genes differed both in their frequency and in
their levels of hypermethylation in various tissues.

Therefore, according to the present invention, genes can be grouped into classes based on
their methylation behavior (Classes A-G, as shown at the right of Figure 1). This allowed for a
visual assessment of concordant methylation of the different genes during various stages of
tumorigenesis. A rationale for each of the gene classes is presented in the following section.

Epigenetic Gene Classes. The analysis of combined behavior of genes with different
levels of DNA methylation would, without appropriate data treatment, be expected to lead to a
bias of the group behavior towards genes with quantitatively high levels of DNA methylation. For
instance, the mean values for gene "Class B" for most of the tumor samples would be driven
primarily by the TIMP3 values, since this gene tended to have higher levels of methylation than
the other two genes in this group (see Figure 1).

Therefore, the methylation values used to generate Figure 1 were collapsed into a binary
variable with a dichotomization point of 4 PMR to equalize the quantitative impact of methylated
genes within each epigenetic class. Samples containing 4 PMR or higher were designated as
methylated and given a value of 1, while samples containing less than 4 PMR were designated as
unmethylated and given a value of 0 (see "Statistics" above, under "Materials and Methods").
This dichotomization moderates the effect of highly methylated genes, simplifies cross-gene
comparisons of methylation frequencies, as shown in Figure 2, and allows the calculation of
class averages of methylation frequencies as shown in Figure 3 (below).

Figure 2 shows the percent of samples methylated for each gene by tissue type. The data
was dichotomized at 4 PMR, with 4 PMR and higher designated as methylated, and below 4 PMR
as unmethylated. The genes, according to the present invention, were grouped according to their
respective epigenetic gene classes (A-G) as shown in Figure 1. The letter "n" equals the number
of samples analyzed for each tissue.
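
The per-gene, per-tissue methylation frequencies of the kind plotted in Figure 2 follow directly from the dichotomized values. The Python sketch below is illustrative only: the function name and record layout are assumptions, and the PMR values are invented and do not reproduce any data from the Figures.

```python
from collections import defaultdict


def methylation_frequency(records, cutoff=4.0):
    """Map (tissue, gene) to (percent of samples methylated, n).

    records is an iterable of (tissue, gene, pmr) tuples; a sample counts
    as methylated when its PMR is at or above the cutoff.
    """
    tallies = defaultdict(lambda: [0, 0])  # [methylated count, total count]
    for tissue, gene, pmr in records:
        tally = tallies[(tissue, gene)]
        tally[0] += 1 if pmr >= cutoff else 0
        tally[1] += 1
    return {key: (100.0 * methylated / total, total)
            for key, (methylated, total) in tallies.items()}


# Invented example records (tissue, gene, PMR).
records = [
    ("NE", "ESR1", 0.0), ("NE", "ESR1", 1.2),
    ("IM", "ESR1", 12.0), ("IM", "ESR1", 3.0),
    ("IM", "ESR1", 45.0), ("IM", "ESR1", 4.0),
]
frequencies = methylation_frequency(records)
print(frequencies[("NE", "ESR1")])  # (0.0, 2)
print(frequencies[("IM", "ESR1")])  # (75.0, 4)
```

Returning n alongside each percentage mirrors the figure, where the number of samples analyzed is reported for each tissue.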

The suitability of the 4 PMR dichotomization point was based on its ability to discriminate
between the different tissue types, as shown in Figures 1-3 (see also Klump et al.,
Gastroenterology 115:1381-1386, 1998). Other dichotomization point values are within the
scope of the present invention, where such dichotomization point values moderate the statistical
effects of highly methylated genes, simplify cross-gene comparisons of methylation frequencies,
and facilitate calculation of class averages of methylation frequencies. For instance, there is still a
statistically significant difference in the mean percent of genes methylated (out of 19 genes)
between the normal esophageal mucosa and the IM (p = 0.0003), DYS (p < 0.0001) and T (p <
0.0001) tissues when the data is dichotomized at 10 PMR.

Additionally, all of the statistically significant findings of the NE and IM methylation
frequency with or without associated dysplasia (see Example 3, below) remain significant at a
dichotomization point of 10 PMR, instead of 4 PMR. It is important to note that 4 PMR is not
comparable to a 4% methylation level of a single CpG dinucleotide. Rather, it indicates that in
this sample, 4% of the DNA molecules had complete methylation at all CpG dinucleotides covered
by the three MethyLight™ oligonucleotides (usually about 8 CpGs). The nature of the MethyLight™
assay is such that it is oblivious to all other methylation patterns that may be present (Eads et al.,
Nucleic Acids Res. 28:E32, 2000).

Therefore, 4 PMR is likely to represent a higher mean level of methylation than 4%. The
extensively methylated molecules that are assayed by MethyLight™ are likely to represent alleles
that have been completely silenced by CpG island hypermethylation, although this was not
investigated herein.

Of the panel of 20 genes, the most informative genes were those with an intermediate
frequency of hypermethylation (ranging from 15% (CDKN2A) to 60% (MGMT) of the sample
values above the 4 PMR methylation cutoff). This group was further subdivided into three
epigenetic gene classes according to the absence (Class "A") or presence (Class "B") of
methylation in normal esophageal mucosa and stomach, or the infrequent methylation of normal
esophageal mucosa accompanied by methylation in all normal stomach samples (Class "C"). The
other genes were less informative, since the incidence of hypermethylation was either very
infrequent (Class "D"), completely absent (Class "E"), or ubiquitous (Classes "F" and "G")
regardless of tissue type (Figures 1, 2 and 3).

Epigenetic gene Class A comprises the genes CDKN2A, ESR1 and MYOD1 (Figures 1, 2
and 3). There was a statistically significant difference in the methylation frequency of ESR1 (p =
0.0001) and MYOD1 (p = 0.0038) of normal esophagus (NE), as compared to IM tissue, but not
for CDKN2A (p = 0.097). The frequency of CDKN2A methylation increased significantly in the
more advanced stages of the adenocarcinoma (T) (p < 0.0001).

Epigenetic gene Class B comprises the genes CALCA, MGMT and TIMP3. In contrast to
Class A, this class exhibited methylation in the normal esophageal mucosa (NE) and stomach (S)
tissue (Figures 1 and 2). Only TIMP3 showed a significant difference in methylation frequency
between the NE and IM values (p = 0.0074).

Epigenetic gene Class C comprises the gene APC which was, in contrast to genes of
Classes A and B, methylated in all normal stomach samples (Figures 1 and 2). This confirms
previous documentation of APC methylation in normal stomach tissue (Eads et al., Cancer Res.
60:5021-5026, 2000). The mechanism which protects APC from methylation in the normal
esophageal tissues (NE) but not in normal stomach tissues (S) is not clear.

Epigenetic gene Class D comprises the genes ARF, CDH1, CDKN2B, GSTP1, MLH1,
PTGS2 and THBS1, which were infrequently methylated (Figures 1 and 2). There was a slight
increase in the methylation frequency of this class of genes in adenocarcinoma (T), but this did
not approach statistical significance (Figure 3). Interestingly, with the exception of PTGS2, which
has not yet been investigated in other systems, the remaining Class D genes are frequently
hypermethylated in other tumor types (Table 2).

Epigenetic gene Class E comprises the CTNNB1, RB1, TGFBR2 and TYMS genes, which
were unmethylated at each stage in the progression of EAC. Similar to most Class D genes, RB1
and TGFBR2 have been found to be hypermethylated in other tumor types (see Table 1, and
literature references under "DEFINITIONS" herein above). It should be noted that all samples
scored positive for DNA input as measured by the control gene (ACTB). Therefore, the lack of
detectable DNA methylation cannot be attributed to a lack of input DNA. The control reaction
was sufficient in each sample, so that a level as low as 1 PMR for a given test gene could be
detected. The integrity and specificity of all methylation reactions was confirmed using in vitro
methylated human DNA.

The epigenetic Class F comprises the HIC1 gene, which was completely methylated,
regardless of tissue type (Figures 1 and 2). HIC1 is commonly methylated in other types of
cancers (Jones & Laird, Nat Genet. 21:163-167, 1999; Baylin & Herman, Trends Genet.
16:168-174, 2000), and has been shown to be methylated in normal breast ductal tissue and bone
marrow samples of breast cancer and AML patients, respectively (Melki et al., Cancer Res.
59:3730-3740, 1999; Fujii et al., Oncogene 16:2159-2164, 1998). Nevertheless, the finding of
ubiquitous methylation of a CpG island in normal tissues was unexpected. Therefore, the validity
of the HIC1 MethyLight™ results was confirmed using a different technique (HpaII-PCR)
(Singer-Sam et al., Nucleic Acids Res. 18:687, 1990).

Epigenetic Class G comprises the non-CpG island MTHFR gene, used herein as a control.
Interestingly, the ubiquitous HIC1 methylation pattern is similar to the non-CpG island MTHFR
control (Class G); however, the percentage of methylated molecules was quantitatively higher for
HIC1 (Figure 1).



Epigenetic Profiles of EAC Progression. Each tissue type showed a unique epigenetic
profile or fingerprint that changed during disease progression (Figure 3, upper panel).

Figure 3 shows a comparison of epigenetic profiles according to the present invention.
The data was dichotomized at 4 PMR, with 4 PMR and higher designated as methylated, and
below 4 PMR as unmethylated. Upper panel:
Mean percent of genes methylated in each gene Class (A-F or ALL 19 CpG islands) by tissue type
(N, normal esophagus; S, stomach; IM, intestinal metaplasia; DYS, dysplasia; T,
adenocarcinoma). The error bars represent the standard error of the mean (SEM). Lower panel:
Statistical analysis of the difference in mean percent of genes methylated in different tissues by
gene Class (A-F) or for all 19 CpG islands combined (ALL). The p-values were generated by a
Fisher's Protected Least Significant Difference (PLSD) test, adapted for use with unequal sample
numbers (SAS Statview™ software).

Classes A, B and C were methylated at a significantly higher frequency in IM tissue than
in normal esophageal mucosa (NE) (Figure 3, upper and lower panels). Furthermore, the
transition from IM to dysplasia (DYS) or malignancy (T) was associated with an additional
increase in Class A methylation (Figure 3, upper and lower panels). The lack of a significant
difference between dysplasia and adenocarcinoma for any of the gene classes or when all 19 genes
are combined (Figure 3, upper and lower panels) suggests that most of these abnormal epigenetic
alterations occur early in the progression of EAC.


In summary of this Example. According to the present invention, quantitative methylation
data of 20 genes (Tables I and II, above) from a screen of 84 tissue specimens from 31 patients
with different stages of Barrett's esophagus and/or associated adenocarcinoma showed a general
increase in the frequency and in the quantitative level of CpG island hypermethylation at
progressively advanced stages of disease (Figures 1-3, above).

Additionally, genes were grouped into novel epigenetic classes based on their methylation
behavior (Classes A-G, as shown herein in Figures 1-3) during tumor progression. This allowed
for graphical representation of concordant methylation of the different genes during various stages
of tumorigenesis, which can be readily appreciated by means of a simple visual assessment.
Each tissue type showed a unique epigenetic profile or fingerprint that changed during
disease progression (Figure 3, upper panel). Classes A, B and C were methylated at a significantly
higher frequency in IM tissue than in normal esophageal mucosa (NE) (Figure 3, upper and lower
panels). Furthermore, the transition from IM to dysplasia (DYS) or malignancy (T) was
associated with an additional increase in Class A methylation (Figure 3, upper and lower panels).

EXAMPLE 2

Hypermethylation was Reflective of EAC Tumor Grade and Stage 

This Example examines whether the grade or stage of an esophageal adenocarcinoma
correlates with a higher frequency of CpG island hypermethylation. According to the present
invention, for EAC, epigenetic Class A gene methylation is significantly higher in stage II, III and
IV tumors relative to less advanced stage I tumors (Figure 4).

Materials and Methods 

TNM staging. The American Joint Committee on Cancer ("AJCC") has designated staging
by TNM classification (Tumor, lymph Node metastasis, distant Metastasis). TNM staging was
used to classify the stage of each esophageal adenocarcinoma from the tissues of Example 1.

Methylation and statistical analysis. Methylation and statistical analysis was as described 
herein under Example 1.

Results 

Methylation of epigenetic Class A genes increases with tumor stage. Moderately
differentiated tumors have significantly less frequent Class A methylation compared to poorly
differentiated tumors (p = 0.045). Additionally, Figure 4 (upper and lower panels) shows that
there is a significantly higher mean number of Class A genes methylated in stage II, III and IV
tumors relative to less advanced stage I tumors. The differences between stage I tumors and stage
II, III and IV tumors did not reach statistical significance for any of the other epigenetic gene
classes.

Figure 4 shows the relationship between Class A methylation frequency and tumor stage
according to the present invention. The data was dichotomized at 4 PMR, with 4 PMR and higher
designated as methylated, and below 4 PMR as unmethylated. Upper panel: Mean number of
genes methylated for Class A with respect to tumor stage (I-IV) is shown (see Figure 1). The error
bars represent the standard error of the mean (SEM). The letter "n" equals the number of samples
analyzed in each tumor stage. Lower panel: Statistical analysis of the difference in mean number
of Class A genes methylated by tumor stage. The p-values were generated by a Fisher's Protected
Least Significant Difference (PLSD) test, adapted for use with unequal sample numbers (SAS
Statview™ software).

In summary for this Example. According to the present invention, in addition to the 
epigenetic profiles or fingerprints (comprising the gene classes disclosed herein) that can be used 
to assess oncogenic progression, the mean number of methylated Class A genes can be used to 
assess the relative stages of EAC tumors. 



EXAMPLE 3 

Methylation of Premalignant Tissues With or Without Associated Dysplasia 

This Example shows that the frequency of Class B methylation in the normal esophagus 
(NE) was found to be significantly higher in patients with associated dysplasia/tumor (p = 0.0037)
(Figure 1). Additionally, Class A methylation was found to be more frequent in IM samples from
patients with concurrent dysplasia or cancer, than in IM samples from patients without any 
evidence of further progression (p < 0.0001) (Figures 1 and 5). That is, there was a significant 
positive association between hypermethylation of epigenetic Class A genes in IM tissue, and the 
presence of associated dysplasia or cancer (Figure 5). 


Materials and Methods 

Histopathology. Histopathological classification was as described under "Materials and
Methods," Example 1 above.

Methylation and statistical analysis. Methylation and statistical analysis was as described 
herein under Example 1.



Results 

Methylation of Premalignant Tissues with or without Associated Dysplasia. The
occurrence, according to the present invention, of CpG island hypermethylation in some cases of
IM for Class A and some cases of normal esophageal mucosa for Class B raised the question
whether these methylation events represent normal methylation patterns in these non-dysplastic
tissues, or whether they reflect methylation changes that predispose cells to further progression. In
the latter case, one would expect to find a higher frequency of such CpG island hypermethylation
in these tissues in patients who have already undergone further disease progression. Therefore, the
frequency of such CpG island hypermethylation was compared between tissues (of the present
study) with or without associated dysplasia.

In the initial study, patients were divided based on whether or not they had Barrett's
esophagus (IM) as their most advanced stage of disease (Figure 1, "NO") or whether they had
associated dysplasia and/or adenocarcinoma present in a different region of the esophagus (Figure
1, "YES"). The frequency of Class B methylation in the normal esophagus (NE) was indeed
found to be significantly higher in patients with associated dysplasia/tumor (p = 0.0037) (Figure
1). Additionally, Class A methylation was found to be more frequent in IM samples from patients
with concurrent dysplasia or cancer, than in IM samples from patients without any evidence of
further progression (p < 0.0001) (Figure 1).

A potential criticism of this analysis is that the same set of samples was used to delineate
the class of genes, as was used to test the association with a clinical parameter. Therefore, a
follow-up study of 20 additional cases of IM was performed entirely independent of the first data
set.

In the follow-up study of 20 cases, two groups of IM samples were collected: patients that
had only IM as the most advanced stage of disease (8 patients), and patients that had IM with
associated dysplasia/adenocarcinoma located in another region of the esophagus (12 patients).
H&E slides (5-micron sections) for each sample were prepared and examined by a pathologist
(K.W.) to verify and localize the IM tissue. Cases that showed any signs of dysplasia or
adenocarcinoma in the paraffin block used for analysis were excluded from this follow-up study.
The IM tissues were carefully microdissected away from other cell types from a 30-micron section
adjacent to the 5-micron H&E section. All specimens were classified according to the highest
grade histopathologic lesion present in that sample.

The initial study had revealed that all IM samples associated with further disease
progression ("YES") had at least two Class A genes methylated, while all IM samples without
associated dysplasia or adenocarcinoma ("NO") did not show any methylation of Class A genes
(Figure 1, under "Barrett's (IM)"). Therefore, a state of having two or more Class A genes
methylated was defined as an indicator of increased risk for the presence of associated dysplasia or
cancer.
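
The indicator defined above can be expressed as a simple rule. The following Python sketch is
illustrative only: the function name, the gene labels and the PMR values are hypothetical, and
only the 4 PMR threshold and the "two or more Class A genes" criterion come from the text.

```python
def has_risk_indicator(class_a_pmr, threshold=4.0, min_genes=2):
    """Return True when at least `min_genes` Class A genes are scored as methylated.

    `class_a_pmr` maps a gene label to its PMR value for one IM sample.
    """
    methylated = [gene for gene, pmr in class_a_pmr.items() if pmr >= threshold]
    return len(methylated) >= min_genes


# Hypothetical IM samples; "gene_1" to "gene_3" are placeholder labels.
sample_with_indicator = {"gene_1": 12.0, "gene_2": 4.0, "gene_3": 0.0}
sample_without_indicator = {"gene_1": 3.9, "gene_2": 0.0, "gene_3": 25.0}
```

With these placeholder values, the first sample has two genes at or above 4 PMR and is flagged,
whereas the second has only one and is not.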

The data from our first series gave a p-value of 0.0048 in a Fisher's exact test of this
association (Figure 5, left panel). The follow-up series of 20 independent cases gave a p-value of
0.018 (Figure 5, right panel).
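
For illustration, a two-sided Fisher's exact test on a 2x2 table (indicator present or absent,
versus associated dysplasia/cancer present or absent) can be sketched in Python as follows.
This is a generic implementation and not the software used in the present study; the function
name and the example counts are hypothetical and do not reproduce the p-values reported above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts (rows: with / without associated dysplasia or cancer;
# columns: two or more Class A genes methylated / fewer than two).
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

The two-sided convention used here sums all tables that are no more probable than the observed
table, which is one common definition; other conventions exist and may give different values.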

Figure 5 shows the percent of two or more Class A genes methylated in intestinal
metaplasia ("IM") tissues with ("Y"), or without ("N") associated dysplasia and/or
adenocarcinoma. The data was dichotomized at 4 PMR, with 4 PMR and higher designated as
methylated, and below 4 PMR as unmethylated. Left panel: Class A methylation in the IM data
illustrated in Figure 1. Right panel: Class A methylation in the IM for a completely independent
follow-up study of twenty different microdissected IM samples. The error bars represent the
standard error of the mean (SEM). The letter "n" equals the number of samples analyzed in each
tissue group.

Therefore, the positive association between hypermethylation of Class A genes and the
presence of associated dysplasia or cancer is significant. It should be noted that the IM samples
without associated dysplasia in this follow-up study (Figure 5, right panel) showed a low
frequency of samples with at least two genes methylated, which is in contrast to the absence of
methylation in the first study (Figure 1, and Figure 5, left panel). This may be attributed to the
fact that the samples in the second series were microdissected from paraffin sections. Therefore,
there is a lower background of unmethylated stromal cells in the sample. In this case, the
methylation signal is not as diluted by other normal cells and consequently the ratio of methylated
molecules to total DNA may rise above the 4 PMR threshold. Alternatively, dysplastic or
malignant tissue may have been missed during the endoscopic survey in some of the cases scored
as free of further disease progression due to the sampling limitations of endoscopy. This is a
well-documented problem in the detection of esophageal adenocarcinoma (Peters et al., J. Thorac.
Cardiovasc. Surg. 108:813-821, 1994).

EXAMPLE 4 

No Clear Evidence of CpG Island Methylator Phenotype ("CIMP") for EAC 

This Example shows that, for the present study of EAC, there was no clear evidence of a
separate group of CIMP tumors, as has been previously defined for colorectal and gastric cancer
(Toyota et al., Proc. Natl. Acad. Sci. USA. 96:8681-8686, 1999; Toyota et al., Cancer Res.
59:5438-5442, 1999). However, CpG island hypermethylation in EAC did occur across multiple
loci in a given sample. Furthermore, the number of loci hypermethylated in a single sample
increased as the disease progressed through different histological stages (Figure 6). The bimodal
distributions seen in IM tissues (Figure 6) can be fully attributed to the concurrent association with
dysplasia or cancer described herein above.

Materials and Methods 

Histopathology. Histopathological classification was as described under "Materials and
Methods," Example 1 above.

Methylation and statistical analysis. Methylation and statistical analysis was as described 
herein under Example 1. 

Results 

CIMP Analysis. It has previously been reported that a subset of colorectal and gastric
tumors display a CpG island methylator phenotype ("CIMP"), characterized by widespread,
aberrant hypermethylation changes affecting multiple loci in a single tumor (Toyota et al., Proc.
Natl. Acad. Sci. USA 96:8681-8686, 1999; Toyota et al., Cancer Res. 59:5438-5442, 1999). This
is reflected in a bimodal distribution of the frequency of the number of genes methylated in a
group of tumors (Toyota et al., Proc. Natl. Acad. Sci. USA 96:8681-8686, 1999). CIMP tumors
are a distinct group of tumors that are defined by a high degree of concordant CpG island
hypermethylation of genes exclusively methylated in cancer, or "type-C" genes. CIMP is
currently thought to be a new, distinct, yet major pathway of tumorigenesis (Toyota et al., Proc.
Natl. Acad. Sci. USA 96:8681-8686, 1999; Toyota et al., Cancer Res. 59:5438-5442, 1999).

Therefore, the question of whether esophageal adenocarcinoma tumors exhibit a CpG
island methylator phenotype (CIMP) was investigated.

Class A genes of the present invention most closely exemplify the "type-C" genes, because 
they lack methylation in the normal tissues. The distribution of the number of Class A genes 
methylated was examined for EAC (Figure 6). 

Figure 6 shows, according to the present invention, methylation frequency distributions in
the progression of esophageal adenocarcinoma. The data was dichotomized at 4 PMR, with 4
PMR and higher designated as methylated, and below 4 PMR as unmethylated. The proportion of
patients with zero to three (Class A), zero to nine (Classes A + D) and zero to fourteen CpG
islands (Classes A + B + C + D) methylated in each tissue is shown. Class E and F CpG islands
were not included since there was no variation in the frequency of methylation between the
different tissues. The letter "n" equals the number of samples analyzed in each tissue.

However, the frequency of genes methylated in the adenocarcinoma tissue did not show
the expected bimodal distribution of CIMP (Figure 6) (Toyota et al., Cancer Res. 59:5438-5442,
1999). Similar results were observed when Class D genes, which also exhibit type C methylation,
were included along with Class A (Figure 6, middle panel) and when Classes A, B, C and D genes
were combined (Figure 6, right panel). Classes E and F genes were not included since they did
not exhibit any methylation variation between the different tissue types.
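
The frequency distributions of the kind shown in Figure 6 can be tabulated as in the following
Python sketch, which counts, for each sample, the number of CpG islands at or above 4 PMR and
reports the proportion of samples at each count. The function names and the PMR values are
hypothetical and are not data from the present study.

```python
from collections import Counter


def methylated_count(pmr_values, threshold=4.0):
    """Number of CpG islands in one sample scored as methylated."""
    return sum(1 for v in pmr_values if v >= threshold)


def count_distribution(samples, n_islands, threshold=4.0):
    """Proportion of samples with 0, 1, ..., n_islands CpG islands methylated."""
    counts = Counter(methylated_count(s, threshold) for s in samples)
    total = len(samples)
    return [counts.get(k, 0) / total for k in range(n_islands + 1)]


# Hypothetical PMR values for three CpG islands in four samples.
samples = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 20.0, 0.0], [4.0, 4.0, 4.0]]
distribution = count_distribution(samples, n_islands=3)
```

A CIMP-like subgroup would appear in such a list as a second peak at high counts, separated
from the main mass of samples; a single-peaked list would not suggest a separate subgroup.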

There was a single sample with 10 out of 14 Class A-D genes methylated (Figure 1, Case
#3 and Figure 6). However, this sample only stands out when Class B genes, which are
methylated in normal esophageal mucosa and therefore do not satisfy the definition of "type-C"
genes that constitute the CIMP phenotype, are included.

Therefore, there was no clear evidence of a separate group of CIMP tumors in the present 
study of esophageal adenocarcinoma, as has been previously defined for colorectal and gastric 
cancer. 

However, CpG island hypermethylation in EAC did occur across multiple loci in a given
sample. Furthermore, the number of loci hypermethylated in a single sample increased as the
disease progressed through different histological stages (Figure 6). The bimodal distributions seen
in IM tissues (Figure 6) can be fully attributed to the concurrent association with dysplasia or
cancer described herein above.



EXAMPLE 5 

Array- and Microarray-based Applications

Microarray-based embodiments are within the scope of the present invention. For
example, one such array-based embodiment uses differential methylation hybridization ("DMH")
(Huang et al., Hum. Mol. Genet. 8:459-470, 1999; Yan et al., Clin. Cancer Res. 6:1432-38, 2000).
DMH is applied to screen paired test and normal samples and to determine whether patterns (see
"Epigenetic patterns," herein under Example 1) of specific epigenetic alterations correlate with
pathological parameters in the tissue samples analyzed. "Amplicons" (Id.), representing a pool of
methylated CpG DNA derived from these samples, are used as hybridization probes in an array
panel containing the CpG island tags of the present invention.

Accordingly, one or more of the CpG island sequences associated with 19 of the 20
disclosed gene sequences (i.e., APC, ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1, GSTP1,
HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2 and TYMS;
see TABLES I and II, above; and see under "Definitions," above), or methylation-altered DNA
sequence embodiments thereof, can be used as CpG island tags in an array or microarray-based
assay embodiment. These 19 gene sequence regions are defined herein by the oligomeric primers
and probes corresponding to SEQ ID NOs:1-54, 58-60, 64 and 65 (see TABLE II, above; SEQ ID
NOs:61-63 correspond to the ACTB "control" gene region used in the present analysis (see
EXAMPLE 1, above)). Associated CpG island sequences are (based on the fact that the
methylation state of a portion of a given CpG island is generally representative of the island as a
whole) those contiguous sequences of genomic DNA that encompass at least one nucleotide of the
sequences defined by these specific oligonucleotide primers and probes, and satisfy the criteria of
having both a frequency of CpG dinucleotides corresponding to an Observed/Expected Ratio >0.6,
and a GC Content >0.5.
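
The two sequence criteria stated above can be checked computationally, as in the following
Python sketch. It assumes the commonly used definition of the Observed/Expected Ratio, namely
(number of CpG dinucleotides x sequence length) / (number of C x number of G); this formula is
an assumption for illustration and is not recited in this passage. The function names and the
example sequences are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("C") + seq.count("G")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/Expected CpG ratio: (#CpG * length) / (#C * #G). Assumed formula."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def meets_cpg_island_criteria(seq, min_obs_exp=0.6, min_gc=0.5):
    """Apply the two thresholds from the text: O/E ratio > 0.6 and GC content > 0.5."""
    return cpg_obs_exp(seq) > min_obs_exp and gc_content(seq) > min_gc


# Hypothetical toy sequences, far shorter than any real CpG island.
cpg_rich = "CGCGCGCG"
at_rich = "ATATATAT"
```

In practice such a check would be applied to a contiguous genomic region of realistic length
rather than to toy sequences of this size.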

These CpG island tags are then arrayed on solid supports (e.g., nylon membranes, silicon,
etc.), and probed with amplicons representing a pool of methylated CpG DNA, from test (e.g.,
tumor) or reference samples. The differences in test and reference signal intensities on screened
CpG island arrays reflect methylation alterations of corresponding sequences in the test DNA.

Comparison of the resulting data with the epigenetic patterns disclosed herein allows for a 
diagnostic or prognostic determination. 

Therefore, according to this embodiment, pattern analysis (see working Examples 1-4,
above) in a subset of CpG island tags, affixed to a solid support to form an array or microarray, is
used to follow the various stages of cancer progression (e.g., gastrointestinal and
esophageal dysplasia, gastrointestinal and esophageal metaplasia, Barrett's esophagus, and
pre-cancerous conditions in normal esophageal squamous mucosa), and can be used to determine
histological grades or stages of tumors, such as esophageal adenocarcinoma.

Other array or microarray embodiments of the present invention will be obvious to those of
ordinary skill in the relevant art. Such embodiments include, but are not limited to, those wherein
the specific primers and/or probes for APC, ARF, CALCA, CDH1, CDKN2A, CDKN2B, ESR1,
GSTP1, HIC1, MGMT, MLH1, MYOD1, RB1, TGFBR2, THBS1, TIMP3, CTNNB1, PTGS2 and
TYMS (see TABLES I and II, above; and see under "Definitions," above), corresponding to SEQ
ID NOs:1-54, 58-60, 64 and 65 (see TABLE II, above; SEQ ID NOs:61-63 correspond to the
ACTB "control" gene region used in the present analysis (see EXAMPLE 1, above)) are arrayed
on solid supports.



DISCUSSION 

There is a need in the art for novel and more sensitive methods of cancer detection,
chemoprediction and prognostics. There is a need in the art to define novel coordinate patterns of
CpG island methylation changes (i.e., novel epigenetic patterns) at multiple loci during
progression of a disease, such as cancer. There is a need in the art to determine tumor-type-specific,
and patient-specific epigenetic patterns or fingerprints. There is a need in the art to
provide biomarkers or probes, such as EAC-specific biomarkers or probes, that can be used in
diagnostic and/or prognostic methods for the treatment of cancer. There is a need in the art to
determine whether esophageal adenocarcinoma displays a CIMP. There is a need in the art for
novel methods for determining the stage of a tumor. The present invention addresses these needs.

A high-throughput, fluorescence-based methylation assay (MethyLight™) was used herein
to examine and define novel hypermethylation patterns of 19 CpG islands and one non-CpG island
during the progression of esophageal adenocarcinoma ("EAC"). The genes were thereby
segregated into six classes of epigenetic patterns in the various tissue types. This is the most
comprehensive methylation survey yet performed on a system having so many distinct histological
stages of disease progression. Furthermore, the present analysis of abnormal DNA
hypermethylation offers a significant advantage over other approaches, such as gene expression
analysis, in that it has greater sensitivity in the presence of contaminating normal cells, a common
limiting factor.

DNA hypermethylation, as disclosed herein, is an early epigenetic alteration in the
multi-step progression of EAC. The premalignant intestinal metaplasia ("IM," or Barrett's esophagus) is
already significantly more methylated than the normal tissue (normal squamous mucosa). The
present invention, in certain embodiments, provides the novel finding of frequent
hypermethylation of five additional genes in this tumor system: MYOD1, MGMT, CALCA,
TIMP3, and HIC1.

The methylation observed for MGMT, TIMP3, and HIC1 in normal tissues may be
attributed to the particular region of the gene in which we analyzed methylation levels (Stoger et
al., Cell 73:61-71, 1993; Larsen et al., Hum. Mol. Genet. 2:775-80, 1993; Jones, P. A., Trends
Genet. 15:34-37, 1999). These three genes were analyzed at CpG islands located at or
downstream of the transcription start site (TABLE 2). However, this does not account for the
CALCA methylation we observed, because we analyzed the promoter region of this gene. Low
levels of CALCA methylation have been previously reported in normal bone marrow samples of
AML patients (Melki et al., Cancer Res. 59:3730-3740, 1999), suggesting that this locus may have
a higher propensity to be methylated in normal tissues of cancer patients.

It is of particular interest to note that dysplastic tissues are more frequently methylated
than stage I tumors for both Class A (p < 0.0001) and B (p = 0.0174) (Figure 1). This is similar to
the finding of genetic abnormalities (LOH, deletions and mutations) present in Barrett's esophagus
with high grade dysplasia but not present in the adjacent invasive EAC (Barrett et al., Nat. Genet.
22:106-109, 1999). Because stage II-IV tumors appear to be methylated at Class A genes at a
similar frequency as dysplasia, this suggests that stage I tumors may actually evolve from a
different origin than the dysplastic tissue and higher staged tumors, or may diverge after dysplasia
independently from stage II-IV tumors during clonal expansion. Alternatively, but less likely,
stage I tumors could undergo a transient reversal of hypermethylation. Tumor development in
Barrett's esophagus is proposed to evolve clonally through the linear multistep pathway of
metaplasia-dysplasia-tumor (Zhuang et al., Cancer Res. 56:1961-4, 1996). However, the
occurrence of genetic and, according to the present invention, epigenetic alterations in a non-linear
order, indicates that the clonal evolution of EAC is more complex than originally predicted
(Barrett et al., Nat. Genet. 22:106-109, 1999). A similar observation has been described for
different stages of bladder tumors (Salem et al., Cancer Res. 60:2473-2476, 2000).

There was, under the present analysis, no clear evidence, aside from one tumor with 10
genes methylated, for a separate cluster of tumors with extensive concordant methylation,
indicative of a CpG island methylator phenotype ("CIMP"). Similar results were obtained even
when only "type-C" genes, as defined for CIMP (methylated in cancer, not methylated in normal
tissues; Toyota et al., Proc. Natl. Acad. Sci. USA 96:8681-8686, 1999; Toyota et al., Cancer Res.
59:5438-5442, 1999), were examined. Interestingly, the "type-C" genes in EAC differ from those
described for colorectal cancer (Id.). For example, ESR1 is classified as a "type-A" (defined as
methylated in aging normal tissues) rather than a "type-C" gene in colorectal cancer, because it is
frequently methylated in the normal colonic epithelium of aging individuals (Id.). However, in
esophageal adenocarcinoma, ESR1 clearly behaves as a "type-C" gene. This may be attributed to
the difference in the technology used to measure hypermethylation, or more likely may be due to
differences in tissue types.

According to the present invention, there is a tissue-specific and tumor-specific propensity
for particular genes to become hypermethylated. For instance, APC is hypermethylated in normal
stomach, but not in normal esophageal mucosa. The tumor-specificity of hypermethylation is
illustrated by the lack of detectable methylation of the two Class E genes TGFBR2 and RB1,
which are frequently hypermethylated in gastric and lung tumors, and retinoblastoma tumors,
respectively (Stirzaker et al., Cancer Res. 57:2229-2237, 1997; Kang et al., Oncogene 18:7280-7286,
1999; Hougaard et al., Br. J. Cancer 79:1005-1011, 1999).

The tumor-specificity of CpG island hypermethylation suggests that there may be
tissue-specific trans-acting factors that modulate methylation changes of these CpG islands during
tumorigenesis and which differ between esophageal adenocarcinomas and other tumor types.
Alternatively, there may be a lack of selective advantage to the silencing of these genes in
esophageal adenocarcinomas by DNA methylation. There are two scenarios in which this would
be the case. One is if the gene in question has been inactivated by a different, genetic mechanism,
rendering hypermethylation of no further selective advantage. The other is if the gene does not
play a role in tumor suppression in this particular tumor system.

Although alterations in DNA methylation are common events in tumorigenesis,
the underlying mechanism is unclear. Abnormal methylation, at least in colorectal tumors, is not
due to a mere upregulation of the DNA methyltransferase genes, suggesting that other major
players are involved (Eads et al., Cancer Res. 59:2302-2306, 1999). The present invention
provides some first glimpses into the process underlying these abnormal methylation changes.

According to the present invention, different, functionally unrelated, genes can behave in
distinct classes with respect to their methylation changes within various tissues of EAC
progression. The CpG island hypermethylation does not appear to be a random, stochastic process
(although there is a stochastic component), but rather a step-wise process that involves multiple,
distinct groups of alterations. This is consistent with the existence of several different
mechanisms that protect against CpG island hypermethylation. In this scenario, the concerted
changes seen at different CpG islands would be the result of the loss of a different type of
protective element at different stages of disease progression. This finding does not appear to be
dependent on the location of the CpG island relative to the gene, since both promoter and internal
CpG islands were observed in all gene classes. The structural features of these CpG islands were
also examined under the present analysis by analyzing the %GC content, the observed/expected
CpG ratio and the CpG:GpC ratio, and no association with gene class was found (TABLE 2).

According to the present invention, the IM or NE samples themselves, with or without
associated dysplasia or cancer, were histologically indistinguishable, yet molecularly distinct. NE
and IM samples derived from individuals with concurrent distally located dysplasia or malignancy
show a statistically higher incidence of CpG island hypermethylation. These findings were
confirmed herein in the IM tissues in a completely independent study. This provides strong
support for the use of epigenetic markers, particularly Class A and B genes, as disease screening
tools and as predictive markers for the progression of more advanced staged disease.

The methylation profiles of the present invention provide methods and compositions for
the early detection of cancer. Such a molecular diagnostic approach using normal and/or
premalignant tissues to identify patients with cancer or at elevated risk for developing cancer
provides an opportunity for early intervention. Furthermore, a benefit of using CpG island
hypermethylation as a diagnostic or prognostic marker is that it can easily be detected in a field of
normal cell contamination as a gain of signal, unlike loss of gene expression (e.g., LOH and
deletion analysis), which is difficult to resolve in a sample with contaminating normal cells.



SUMMARY 

According to the present invention, the 19 CpG islands (TABLES I and II) studied
segregate into six classes of epigenetic patterns in the various tissue types. Each class undergoes
unique epigenetic changes at different steps of disease progression of EAC. The methylation
profiles provide methods and compositions for the early detection of cancer.